Step-by-Step Example
This page provides a detailed example of how to use NYT_Transformer. We’ll create a project called Ninja that takes a CSV file and loads it into a MySQL database.
The sample code used on this page is available in the source files (.tar or .zip).
Before stepping through this example, complete the instructions under Setup.
Data
The Ninja data consists of some basic real estate information (because even a ninja needs a place to crash). NYT_Transformer will parse the input data so it can be entered into a listings database table.
Sample Ninja report
"ID","LOCATION", "AVAILABLE", "PRICE" "101","Uptown", "False", "$50,000" "202","Downtown", "True", "$150,000" "303","Midtown", "False", "$1,000,000" "404","Suburb", "True", "$20,000"
Database table ninja_listings
| ID | LOCATION | AVAILABLE | PRICE | CLIENT_TYPE |
|---|---|---|---|---|
| 101 | Uptown | False | 50000 | NINJA |
| 202 | Downtown | True | 150000 | NINJA |
| 303 | Midtown | False | 1000000 | NINJA |
| 404 | Suburb | True | 20000 | NINJA |
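To make the target concrete before we bring in NYT_Transformer, here is a minimal sketch of the same transformation in plain PHP. It is only an illustration: the function name ninjaReportToRows is hypothetical, it uses the built-in str_getcsv rather than any NYT_Transformer class, and it returns rows in memory instead of writing to MySQL.

```php
<?php
// Stand-alone sketch: plain PHP only, no NYT_Transformer classes.

/**
 * Turn the raw report text into rows shaped like ninja_listings.
 * The function name is hypothetical.
 */
function ninjaReportToRows($csvText) {
    $lines = preg_split('/\R/', trim($csvText));
    // The first record is the column list.
    $columns = str_getcsv(array_shift($lines), ',', '"', '\\');
    $rows = array();
    foreach ($lines as $line) {
        $row = array_combine($columns, str_getcsv($line, ',', '"', '\\'));
        // Strip "$" and "," so PRICE fits a numeric column.
        $row['PRICE'] = str_replace(array('$', ','), '', $row['PRICE']);
        // Constant field added to every record.
        $row['CLIENT_TYPE'] = 'NINJA';
        $rows[] = $row;
    }
    return $rows;
}

$report = <<<'CSV'
"ID","LOCATION","AVAILABLE","PRICE"
"101","Uptown","False","$50,000"
"303","Midtown","False","$1,000,000"
CSV;

$rows = ninjaReportToRows($report);
print_r($rows[1]);
```

The rest of this page builds the same three steps (read, clean the price, add the client type) out of NYT_Transformer input, filter and output objects.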
Setting Up the Bin
In your NYT_Transformer project directory (any directory you choose), create a file called ninjaRun.php. This file will tell NYT_Transformer which files to include and what level of logging to use, and will run the NinjaReader file we’ll create later.
Sample ninjaRun.php file
1 <?php
2 /**
3 * Bin file for Ninja. This file performs
4 * the includes, sets the logging,
5 * and then executes the run
6 *
7 * @author Paul Robbins
8 * @version $Id: $
9 * @package Ninja_FL
10 */
11 require_once('NYTD/Feeds/Config.class.php');
12 require_once('Ninja/Config.class.php');
13 require_once('Ninja/NinjaReader.class.php');
14 require_once('Ninja/Filter/FormatPrice.class.php');
15
16 // Set up logging
17 NYTD_Feeds_Logger_Audit::getInstance()->openLog(Ninja_Config::AUDIT_LOG);
18 NYTD_Feeds_Logger_Audit::getInstance()->setLogLevelMask(Ninja_Config::getAuditLogLevel());
19 NYTD_Feeds_Logger_Error::getInstance()->openLog(Ninja_Config::ERROR_LOG);
20
21 // Start the run
22 $ninjaController = new Ninja_NinjaReader();
23 $ninjaController->run();
24
25 // Close logs
26 NYTD_Feeds_Logger_Audit::getInstance()->closeLog();
27 NYTD_Feeds_Logger_Error::getInstance()->closeLog();
28
29 ?>
- Lines 2 through 10 contain the comments in PHPDoc format.
- Line 11 performs the include of the NYT_Transformer code and sets up NYT_Transformer variables.
- Lines 12 through 14 perform the includes of the project’s config file, main class and a localized filter. We’ll create all of these later.
- Lines 17 through 19 set up logging.
- Line 22 creates a new object of the project’s main class, and line 23 starts the run.
- Lines 26 and 27 close the logging.
In the next section, we’ll create the project’s main class.
Setting Up the Main Class
In this section, we’ll create the file NinjaReader.class.php. This is the manager class for the project.
Sample NinjaReader.class.php file
1 <?php
2 /**
3 * Class and functions used to read the Ninja Report and
4 * enter it into the DB
5 *
6 * @author Paul Robbins
7 * @version $Id: $
8 * @package Ninja_FL
9 */
10
11 class Ninja_NinjaReader extends NYTD_Feeds_Base {
12 /**
13 * Function to set input parameters and create an Input class object
14 *
15 * @return NYTD_Feeds_Input_CSV
16 */
17
18 protected static function getObjectInput() {
19 $inputParams = array(
20 'filename' => './raw/The_Ninja_Report.csv',
21 'first_record_is_column_list' => TRUE
22 );
23
24 return new NYTD_Feeds_Input_CSV($inputParams);
25 }
26
27 /**
28 * Function that manages the setting of the input, filter
29 * and output objects into the controller
30 *
31 * @return NYTD_Feeds_Controller
32 */
33 public function getNinjaReaderController() {
34 $input = $this->getObjectInput();
35
36 $price_filter = new Ninja_Filter_FormatPrice();
37
38 $filter_add_client_type = new NYTD_Feeds_Filter_FieldListModification(
39 array(
40 'add_list' => array(
41 'CLIENT_TYPE' => 'NINJA'
42 )
43 )
44 );
45
46 $output = parent::getObjectOutputMySQL(
47 array(
48 'table' => 'ninja_listings',
49 'sql_command' => 'PROC_CALL',
50 'use_smart_updates' => FALSE,
51 'track_deleted_rows' => FALSE,
52 'stored_proc_name' => 'fl_load_ninja_proc',
53 'batch_size' => 1
54 ),
55 Ninja_Config::getDBConnectionParams()
56 );
57
58 $controller = parent::getObjectController();
59 $controller->setInput( $input );
60 $controller->setFilter( $price_filter );
61 $controller->setFilter( $filter_add_client_type );
62 $controller->setOutput( $output);
63
64 return $controller;
65 }
66
67 /**
68 * Run function called by ninjaRun.php to start the processing of the file
69 * @return void
70 */
71 public function run() {
72 $controller = $this->getNinjaReaderController();
73 $controller->run();
74 }
75 }
76 ?>
- Line 11 creates the NinjaReader class, extending the NYT_Transformer base class. NinjaReader will handle the creation of all input and output objects.
- Line 18 is the beginning of the function that creates an Input object. We’ll use the NYT_Transformer input object for CSV files (one of the subclasses packaged with the code).
- Lines 19 through 22 set up the parameter list for the CSV input record. We tell it to read the file The_Ninja_Report.csv. The first_record_is_column_list option tells the input record that the first line in the CSV file contains the names of the columns.
- Line 24 returns an Input_CSV object.
- Line 33 is the start of the main function of the NinjaReader class.
- Line 34 calls the getObjectInput function and sets the Input_CSV object that is returned to a local variable.
- Line 36 creates a new Filter_FormatPrice object and sets it to a local variable. We’ll define Filter_FormatPrice later.
- Lines 38 through 44 create a Filter_FieldListModification object. FieldListModification is an NYT_Transformer filter that lets you modify (add, delete or rename) fields in a record. In this case, we’re adding the field CLIENT_TYPE and setting the value to NINJA. This field will be added to every record that is read from the CSV file.
- Lines 46 through 56 set up a MySQL output object. The function getObjectOutputMySQL is part of the NYT_Transformer base class. Since NinjaReader extends that class, we can just call the function with the parent keyword.
- Line 48 specifies the table that we’ll be working with.
- Line 49 tells the output object that we’ll be calling a stored procedure. (Again, this is just an example; you might not need to use a stored procedure.)
- Lines 50 and 51 indicate that we’re not using smart updates or tracking deleted rows. (See the Advanced Functionality page and the PHPDoc for more information about these options.)
- Line 52 specifies the name of the stored procedure to call.
- Line 53 tells the output object to process only one record at a time. If you’re using a different sql_command option, such as INSERT, you can process multiple records at a time by changing batch_size.
- Line 55 defines the database connection parameters by calling a static function in the configuration file (discussed under Setting Up the Config File).
- Line 58 calls the NYT_Transformer base function getObjectController to create the controller that will manage the entire run.
- Line 59 tells the new controller to use the Input_CSV object we defined earlier.
- Lines 60 and 61 tell the controller that we’ll be applying filter objects to all records. (See Setting Up a Localized Filter.) The order of these two lines determines the order in which the filters are run.
- Line 62 tells the controller to use the Output_MySQL object we defined earlier.
- Line 64 returns the controller with the input, filters and output now defined.
- Line 71 is the beginning of the function that’s called by the bin file. (See Setting Up the Bin.)
- Line 72 calls the main function of this class and sets its return value to a local variable.
- Line 73 executes the controller function run() to begin the entire NYT_Transformer process.
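The point about filter order (lines 60 and 61) can be illustrated with a toy pipeline. This is not NYT_Transformer’s controller: ToyController and its process method are hypothetical names, and the filters are plain closures. The sketch only shows the general idea that each setFilter call appends to a list, so the order of the calls becomes the order of execution.

```php
<?php
// Toy pipeline, NOT NYT_Transformer's controller: all names are hypothetical.
class ToyController {
    private $filters = array();

    // Each call appends, so call order becomes execution order.
    public function setFilter($filter) {
        $this->filters[] = $filter;
    }

    // Pass one record through every filter in turn.
    public function process(array $record) {
        foreach ($this->filters as $filter) {
            $record = $filter($record);
        }
        return $record;
    }
}

$controller = new ToyController();

// First filter: clean up the price.
$controller->setFilter(function (array $r) {
    $r['PRICE'] = str_replace(array('$', ','), '', $r['PRICE']);
    return $r;
});

// Second filter: add the constant client type.
$controller->setFilter(function (array $r) {
    $r['CLIENT_TYPE'] = 'NINJA';
    return $r;
});

$result = $controller->process(array('ID' => '202', 'PRICE' => '$150,000'));
print_r($result);
```

In this example the two filters touch different fields, so swapping them would not change the result. Order starts to matter as soon as one filter reads a field that another filter writes.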
In the next section, we’ll set up the localized filter referenced in line 60 of NinjaReader.class.php.
Setting Up a Localized Filter
The FieldListModification filter in the NinjaReader class is an NYT_Transformer base filter (packaged with the source). It’s generic enough for most projects, but in some cases you may need to extend the base Filter class and write your own. Luckily, an NYT_Transformer record is well structured, so manipulating it is simple.
For this example, the directory /Filter will house the localized filter. Setting up this directory below the main class indicates that its files perform only one task, instead of managing the project. If your project requires a localized Input, Output or Controller, you can use this same logic and create a subdirectory under the main project directory.
Sample FormatPrice.class.php file
1 <?php
2 /**
3 * Filter class for manipulating the price field
4 *
5 * @author Paul Robbins
6 * @version $Id: $
7 * @package Ninja_FL
8 */
9 class Ninja_Filter_FormatPrice extends NYTD_Feeds_Filter {
10 public function __construct() { }
11
12 /** public function filter
13 * @param NYTD_Feeds_Record
14 * @return boolean success flag
15 */
16 public function filter( NYTD_Feeds_Record $record_obj ) {
17 feedsAuditLog( "filter() called", NYTD_Feeds_Config::RUNLEVEL_FILTERING, get_class($this) );
18
19 if ( is_object( $record_obj ) ) {
20 $price = $record_obj->getField('PRICE');
21 $price = str_replace(',', '', $price);
22 $price = str_replace('$', '', $price);
23 $record_obj->setField('PRICE', $price);
24
25 return 1;
26 }
27 else {
28 throw new NYTD_Feeds_InternalStateException( sprintf( 'USAGE: %s::filter( record_obj )', get_class($this) ) );
29 }
30 }
31 }
32 ?>
- Line 9 creates the class, called Ninja_Filter_FormatPrice. It extends the NYT_Transformer base Filter class and inherits many functions.
- Line 10 contains a blank constructor; we don’t need to pass any parameters into this particular filter.
- Line 16 is the beginning of the filter() function. Every localized filter must define this function, because NYT_Transformer calls filter() on each filter that is set on the controller in the main class. The controller passes in an NYT_Transformer record object, so we have to tell the filter function to accept it.
- Line 17 is a logging function (explained further in Setting Up the Config File).
- Line 19 verifies that what the controller sent to this filter is really an object.
- Line 20 calls the getField function on the record object. This function retrieves the value of the record field with the name PRICE. We set it to a local price variable.
- Line 21 removes any commas in the price. The original report contained values such as $150,000, which we want to convert to 150000.
- Line 22 strips the dollar sign off the price.
- Line 23 calls the setField function to set the record’s PRICE field to the value of the local variable. This essentially replaces the old PRICE value of $150,000 with the new value of 150000.
- Line 25 returns 1 as a success flag. The controller is waiting for a return value; without that, it won’t continue to its next step. By returning a 1, the filter tells the controller that all went well and the process can continue. If a 0 is returned, the record will be discarded (and if that’s what you want to do, the better choice is the filter called DiscardUnwanted).
- Line 28 throws an exception if the controller sends the filter something other than a record.
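If you want to try the filter logic without installing NYT_Transformer, the following sketch mirrors lines 19 through 29 in plain PHP. ToyRecord is a hypothetical stand-in for NYTD_Feeds_Record that mimics only getField and setField, the function name formatPriceFilter is made up for this example, and the logging call is left out.

```php
<?php
// Hypothetical stand-in for NYTD_Feeds_Record; only the two methods
// used in the walkthrough are mimicked.
class ToyRecord {
    private $fields;

    public function __construct(array $fields) {
        $this->fields = $fields;
    }

    public function getField($name) {
        return $this->fields[$name];
    }

    public function setField($name, $value) {
        $this->fields[$name] = $value;
    }
}

// Same steps as filter() in FormatPrice.class.php, minus the logging call.
function formatPriceFilter($record) {
    if (!is_object($record)) {
        throw new InvalidArgumentException('USAGE: formatPriceFilter( record_obj )');
    }
    // Remove commas and the dollar sign in one call.
    $price = str_replace(array(',', '$'), '', $record->getField('PRICE'));
    $record->setField('PRICE', $price);

    return 1; // success flag: keep the record
}

$record = new ToyRecord(array('ID' => '202', 'PRICE' => '$150,000'));
$kept = formatPriceFilter($record);
echo $record->getField('PRICE'), "\n"; // prints 150000
```

Passing arrays to str_replace collapses the two replacement lines into one call; the two-step version in the sample file behaves the same way.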
In the final section, we’ll set up the config file.
Setting Up the Config File
In most cases, the config file will be fairly simple. This sample file defines a few variables and establishes the database information for the output object.
Sample Config.class.php file
1 <?php
2 /**
3 * Config class for Ninja
4 *
5 * @author Paul Robbins
6 * $Id: $
7 * @package Ninja_FL
8 */
9 class Ninja_Config
10 {
11
12 /**
13 * variable for project error log
14 * @var string
15 */
16 const ERROR_LOG = "./ninja_load.error.log";
17
18 /**
19 * variable for project audit log
20 * @var string
21 */
22 const AUDIT_LOG = "./ninja_load.audit.log";
23
24 /**
25 * Returns the audit log level
26 * @return int
27 */
28 static function getAuditLogLevel() {
29 return NYTD_Feeds_Config::RUNLEVEL_SUMMARY;
30 }
31
32 /**
33 * Returns the environment
34 * @return string
35 */
36 static function getEnvironment() {
37 return get_cfg_var('my.servertype');
38 }
39
40 /**
41 * Returns the connection information based on environment
42 * @return array
43 */
44 static function getDBConnectionParams($character_set = 'latin1') {
45 switch ( self::getEnvironment() ) {
46 case 'development':
47 $params = array (
48 'host' => 'myserver1.myhost.com',
49 'database' => 'VendorContent_Ninja',
50 'username' => 'ninja',
51 'password' => 'n1nj4',
52 'character_set' => $character_set
53 ); break;
54 case 'staging':
55 $params = array (
56 'host' => 'myserver2.myhost.com',
57 'database' => 'VendorContent_Ninja',
58 'username' => 'ninja',
59 'password' => 'n1nj4',
60 'character_set' => $character_set
61 ); break;
62 case 'production':
63 $params = array (
64 'host' => 'myserver3.myhost.com',
65 'database' => 'VendorContent_Ninja',
66 'username' => 'ninja',
67 'password' => 'n1nj4',
68 'character_set' => $character_set
69 ); break;
70 default:
71 throw new NYTD_Feeds_InternalStateException( 'No db params set for environment ' . self::getEnvironment());
72 }
73 return $params;
74 }
75 }
76 ?>
- Line 16 defines the constant that holds the path of the error log.
- Line 22 defines the constant that holds the path of the audit log.
- Lines 28 through 30 set up a function that returns the audit log level. See the table below for options.
- Lines 36 through 38 set up a function to return the environment.
- Lines 44 through 74 set up a function to return the database parameters for the current environment.
Here are the various logging levels:
| NYT_Transformer Global Variable | Value | Logging Type |
|---|---|---|
| NYTD_Feeds_Config::RUNLEVEL_NONE | 0 | No audit log generated |
| NYTD_Feeds_Config::RUNLEVEL_SUMMARY | 1 | Show summary only |
| NYTD_Feeds_Config::RUNLEVEL_IO_CONNECTIONS | 2 | Show IO connections, files, databases |
| NYTD_Feeds_Config::RUNLEVEL_FETCHING | 4 | Show input data fetched |
| NYTD_Feeds_Config::RUNLEVEL_PARSING | 8 | Show input data parsed |
| NYTD_Feeds_Config::RUNLEVEL_FILTERING | 16 | Show input data filtered |
| NYTD_Feeds_Config::RUNLEVEL_STORING | 32 | Show data to be stored |
| NYTD_Feeds_Config::RUNLEVEL_SQL | 64 | Show all SQL statements and results |
| NYTD_Feeds_Config::RUNLEVEL_ALL | 255 | Show everything |
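Most of the values in the table are powers of two, and the bin file passes the level to a method called setLogLevelMask. That suggests the levels can be combined with a bitwise OR, although this page does not say so explicitly, so treat the following as an assumption to verify against the PHPDoc. The sketch uses local copies of the constants so it runs without NYT_Transformer; levelEnabled is a hypothetical helper.

```php
<?php
// Local copies of the values from the table above, so this runs without
// NYT_Transformer. Combining them with | is an assumption based on the
// name setLogLevelMask(), not documented behaviour.
const RUNLEVEL_SUMMARY   = 1;
const RUNLEVEL_PARSING   = 8;
const RUNLEVEL_FILTERING = 16;
const RUNLEVEL_SQL       = 64;
const RUNLEVEL_ALL       = 255;

// Summary + filtering + SQL: 1 | 16 | 64 = 81.
$mask = RUNLEVEL_SUMMARY | RUNLEVEL_FILTERING | RUNLEVEL_SQL;

// Hypothetical helper: is a given level switched on in the mask?
function levelEnabled($mask, $level) {
    return ($mask & $level) === $level;
}

var_dump(levelEnabled($mask, RUNLEVEL_FILTERING)); // bool(true)
var_dump(levelEnabled($mask, RUNLEVEL_PARSING));   // bool(false)
```

If the assumption holds, getAuditLogLevel() in the config file could return such a combined value instead of a single level.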
Summary
This example has stepped you through the basics of NYT_Transformer and illustrated its features. The Ninja project is small (around 200 lines of code) but can process a file of any size. And it’s easily adapted to different inputs: if the report format were to change from CSV to pipe-delimited, you’d need to change just one line of code. Use the sample files to start your own project and discover the power and flexibility of NYT_Transformer.
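The idea behind that one-line change can be shown in plain PHP. This page does not show which NYT_Transformer parameter or class you would change, so the sketch below does not guess at it; it uses the built-in str_getcsv and a hypothetical parseReportLine helper to show that only the delimiter differs between the two formats.

```php
<?php
// Plain PHP, not NYT_Transformer: the delimiter is the only thing that differs.
function parseReportLine($line, $delimiter) {
    return str_getcsv($line, $delimiter, '"', '\\');
}

$fromCsv  = parseReportLine('"404","Suburb","True","$20,000"', ',');
$fromPipe = parseReportLine('"404"|"Suburb"|"True"|"$20,000"', '|');

var_dump($fromCsv === $fromPipe); // bool(true)
```

Everything downstream of the parser, including the filters and the output, sees the same fields either way.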
