CSV file upload with ZIP containers, jQuery, Ajax and PHP 5.4 progress tracking – III

In the first 2 articles of this series I described a bit of what had to be done on the JS/jQuery client side to trigger the first phase (Phase I) of an Ajax controlled file upload (here: of a Zip container file). See:

CSV file upload with ZIP containers, jQuery, Ajax and PHP 5.4 progress tracking – I
CSV file upload with ZIP containers, jQuery, Ajax and PHP 5.4 progress tracking – II

In this third article, we take a first glimpse at what has to be done on the server side – in our case in the PHP 5.4 target program of our Ajax request. There are of course very many and different ways to deal with the initial treatment of an uploaded file. What I normally do is to create a “File_Handler” object based on a class which encapsulates and provides all required methods. However, for the sake of a better understandability the code fragments presented below do not always follow a stringent OO line of code development and are sometimes very basic. We only show code elements that are of fundamental importance.

The reader may remember that the method of our Javascript CtrlO for controlling the upload form (see the last article) called a PHP target program named

handle_uploaded_init_files.php5?file

.
The attached GET-parameter distinguishes the first phase of the upload from several following phases. One of the initial things our PHP program should do is to open/access a PHP session, to start PHP output buffering and to check whether the $_GET element $_GET[‘file’] exists:

Excerpts from the PHP target program of Phase I – handle_uploaded_init_files.php5:

	$time_0 = microtime(true); 
	session_start();
	ob_start(); 
	....
	// Deal with phase I of the upload 
	if ( isset( $_GET['file'] )) {
		...
		$response = handle_transferred_file(); 
		ob_end_clean(); 
		echo $response;  
	}

(Side remark: Note that the attached Get parameter is nevertheless transferred via the POST mechanism of HTTP !)

$response represents a prepared encoded JSON-Object which is sent back to the JS-Ajax-client at the end of the program. By the “if”-statement we just distinguish the actions of Phase I from later phases where we deal with each file transferred in our large Zip-container file. The main work for our phase I is in our example obviously done inside a function “handle_transferred_file()”.

The use of “session_start()” seems to be quite reasonable. The PHP manual
http://php.net/manual/en/session.upload-progress.php
describes that upload progress information is supplied via $_SESSION. This is just the way the progress tracking of PHP 5.4 works!

However, we shall come back to this point at the bottom of this article and we shall see that things are not that simple. But, we may use the $_SESSION array also for keeping other interesting information. Anyway, using a PHP-session (=opening/accessing a session) will do no harm here.

But why do we need ob_start() ?
In the last article of our series the Ajax answer to the client was requested to have the form of a JSON object. So, our Ajax driven client program expects exactly one information stream in the form of a JSON object. However, if your own PHP code accidentally produces strings by executing some echo, print or print_r statements this rule will be violated. The client will receive the first PHP output as the expected Ajax JSON answer and
will not be able to parse it. This would result in an Ajax client error! Another source of unwanted and not JSON encoded output may come from the PHP engine itself which may create warnings and error messages during code execution. Therefore, it is a good habit to gather all such information in the output buffer and put it eventually as a special string element (as “system_mgs”) of an array which we shall encode as the final JSON object. See below.

A function and an array for producing the JSON answer of upload Phase I

Let us now turn to our function dealing with phase I. In our simple test case it basically does 2 things:

  • It creates a singleton object for performing progress tracking and later some file operations. This object then does the necessary things itself and produces an array which we in our example receive in the variable $response.
  • It enriches the response array with some information and encodes it as a JSON object.

That we have split the required actions into these steps is again more for the sake of illustration purposes.

function handle_transferred_file() {
	...
	// Some local variables 
	$ctrl_msg = '';
	
	// Includes
	require_once $my_include_path . 'class_file_handler.php5';
	require_once $my_include_path . 'class_file_handler_params.php5';
		
	// Parameter Singleton with parameters to control the file handling  	
	$F_Params = new FileHandlerParams(); 
		
	// File handler object (singleton) 	
	$F_Handler = new Basic_Class_File_Handler( $F_Params );
	if (isset($FCheck->msg)) {
		$ctrl_msg .= "\r\nFCheck initialized";
		$ctrl_msg .= "\r\nInitial msg = " . $FCheck->msg;
	}

	// Transfer the file to its directory location and unzip its contents 
	$F_Handler->check_and_save_uploaded_file(); 

	// Required time and response enrichment 		
	$time_f = microtime(true);
	$dtime_f = $time_f - $time_0;
	$F_Handler->ay_ajax_response_1["transfer_time"] = $dtime_f;
	
	// Response enrichment by system messages form the output buffer 
	$F_Handler->ay_ajax_response_1['sys_msg'] .= ob_get_contents(); 
	
	// return JSON encoded array 
	$response =json_encode($F_Handler->ay_ajax_response_1);
	return $response; 
}

This is fairly simple to understand: We load some class definitions – one for the handling of the file transferred into the $_FILE superglobal and one with a bunch of parameters. We then create a singleton object $F_Handler which does most of the required actions of Phase I. Its property “$F_Handler->ay_ajax_response_1” obviously is an array which is enriched by the contents of the PHP output buffer and some time information (just for illustration purposes). This array is eventually encoded as the required JSON answer object.

Typically the element $F_Handler->ay_ajax_response_1[‘sys_msg’] is used on the client side for tests only via the console.log() statement of Javascript.

Possible parameters of the File Handler object $F_Handler

What kind of parameters may be required? E.g.: the path of the directory where we want to save our transferred Zip-file and/or its contents on the server. The expected file ending. The maximum file size allowed. A parameter defining whether we really want to check the progress of the upload. The reader may think about more useful parameters.

In our example such parameters can be gathered and maintained via properties of the class “FileHandlerParams”. The F_Handler object may extract them from the respective Parameter object given as an argument to its constructor, and write them afterward into internal property variables.

Main elements of a Basic_File_Handler class

The “Basic_File_Handler_Class of our example may contain the following methods:

class Basic_Class_File_Handler {
	
	var $file_key; 		// Key of uploaded file in $_FILES[$key]
	var $file_name = ''; 	// Name of present file (uploaded or part of a pipeline)  
	var $file_mime = ''; 	// Mime-Type of present file 
	var $file_end = ''; 	// Suffix (ending) of present file 
	
	var $file_expected_end = array("csv", "zip");	// Allowed suffixes of the transferred file 
	
	var $file_size; 		// Size of present file 
	var $max_file_size = 100000000;	// Maximum allowed size 100MBytes
	var $check_progress = 0; 	// Check progress by means of PHP 5.4  ?
	var $sess_key_progress = ''; 	// The key of $_SESSION where to find upload progress infos 
	
	// Upload dir 
	var $upl_dir;  			// The given short name of the dir without root path 
	var $target_dir; 		// The full path of the target dir on the PHP server 

	var $ziel_file_name_oe; 	// Target name of file without ending and "."
	var $ziel_file_name; 		// Full target name of file 
	var $ziel_file_pfad; 		// Full target path relative to web domain dir  
	var $ziel_file_pfad_dom;	// Full target path rel. to PHP root on the server 

	var $root;   			// rel. path of present application to PHP application root
	var domain_root;		// rel. path of present application to Web domain root 
	
	var $upload_success = 0; 
	var $msg = ''; 
	var $sys_msg = ''; 
	var $err = 0; 
	var $err_msg = '';  
	
	// zip file info 
	var $num_extracted_files = 1; 
	var $file_pipeline = 0;  // will be set to 1 if the zip-file contains more than 1 files 
	
	// Ajax response array 
	var $ay_ajax_response_1 = array();   // For phase 1 	
	
	.....
			
// Main methods 
// -----------
	// constructor 
	function __construct($Params)  {
 		...
	}

	// function to check properties of the uploaded file and save it in a target directory on the server  
	function check_and_save_uploaded_file() {
		....
	}		
	
	// Method to prepare Ajax (JSON) response array for phase 1 of upload 
	function prepare_ajax_response_phase_1() {
		...
	}
		
	// Check existence of transferred file in $_FILES
	function check_file_existence_and_props() {
		...
	}
	
	// Check suffix = file ending 
	function get_and_check_file_suffix() {
		...
	}
	
	// Check File_Size
	function check_file_size() {
		...
	}
		
	// Delete all existing files in the upload dir 
	function delete_files_from_upload_dir() {
		...
	}
	
	// Move the uploaded file or the content files of an uploaded zip file to the target directory 
	function move_file_to_upload_dir() {
		...
	}
	
	// function to determine the number of extracted files and the name of the next file to handle  
	function determine_num_files_and_next_file_name() {
		...
	}
	
	// Method to handle a zip archive 
	function handle_zip_archive( $dest_file, $dest_dir) {
		...
	}
	...
	...

// End of class definition  
}

We shall look at some details of these methods in the articles to come.

The key to session information about the progress of the ongoing upload progress

The constructor of our Basic_File_Handler class gets the task to read in parameters. In addition we use it here and try to retrieve some (initial?) progress information from the $_SESSION array – just for learning purposes. Actually, we shall see that this trial will NOT give us any progress information at all or only a trivial one.

function __construct($Params_ext) {
	
	// read external parameters 
	$num_args = func_num_args(); 
	if (num_args == 1 ) {
		$Params = func_get_arg(0);
		if ( is_object($Params) && get_class($Params) == "FileHandlerParams" ) {
			$this->file_key  = $Params->file_key; 
			if( is_array($Params->file_types) && count($Params->file_type) > 0 ) {
				$this->file_type = $Params->file_types; 
			}
			$this->upl_dir 	= $Params->upload_dir . "/";
			$this->max_file_size	= $Params->max_file_size; 
			$this->check_progress= $Params->check_progress; 
		}
	}
	// Error treatment 
	else {
		...
		$this->ay_ajax_response_1[sys_msg] .= "\r\nWrong Parameter object!"; 
	}		
		
	// Test the progress information in $_SESSION (Is this reasonable here ???) 
	if ( $this->check_progress == 1 ) {
		// Get the required string end for the key of progress info in the $_SESSION array
		// This information was delivered via $_POST by the Ajax client program 
		$this->sess_key_progress = '';
		$key_POST = ini_get("session.upload_progress.name");
		if (isset( $_POST[$key_POST] ) ) {
			$this->sess_key_progress .= ini_get("session.upload_progress.prefix"). $_POST[$key_POST];
			$this->sys_msg .= "<br>sess_key_progress = " . $this->sess_key_progress;  
			$this->ay_ajax_response_1['sess_key_progress'] = $this->sess_key_progress;  
			$_SESSION['progress_key'] = $this->sess_key_progress;
			
			// Write a test value into a special Session variable 
			$current = -1;
			if (isset($_SESSION[$this->sess_key_progress]) && !empty($_SESSION[$this->sess_key_progress])) {
				$current 	= $_SESSION[$this->sess_key_progress]["bytes_processed"];
				$total 		= $_SESSION[$this->sess_key_progress]["content_length"];
				$current	= ($current < $total) ? ceil($current / $total * 100) : 100;
			}
			$this->sys_msg .= "<br>Initial progress value of the file transfer was " . $current; 
		}
	}	
		
	// set the TARGET DIR for saving the file
	$this->target_dir 	= $this->root . $this->upl_dir;
			
	return;
}

 
$this->check_progress is set by the parameters transferred.

Note that progress tracking and the delivery of related information in the $_SESSION array requires the following setting in the php.ini file (“/etc/php5/apache2/php.ini” on Opensuse)

; Enable upload progress tracking in $_SESSION
; http://php.net/session.upload-progress.enabled
session.upload_progress.enabled = On

In the lower part of the PHP code we see how the key for the element of the $_SESSION array which contains information about the upload progress is composed:

We first need a parameter from the php.ini file which we read by ini_get(“session.upload_progress.prefix”). Then we need another parameter of our php.ini file on the server which gives us a special index of the $_POST array:
$key_POST = ini_get(“session.upload_progress.name”).
This index points to an element of the $_POST array which (hopefully) contains a unique identifier of our presently progressing upload process. We must submit this identifier by our Ajax client method initiating the whole file transfer; see the last article in this series. Eventually, we combine both pieces of information to get the index ($this->sess_key_progress) of an element of the $_SESSION array which will contain progress information about our upload.

In the code above we write this information into our output array as part of an Ajax response.

When and how is the progress information
available in the $_SESSION array?

The next code section then uses our composed key to read a test status value of the upload (at the point of code execution) from the $_SESSION array. Now let us assume that we really used our coding above – what values would we get ? Some integer numbers between 0 and 100 ?

The answer clearly is NO – we would always get “-1”. Why ?

You may say – maybe the connection is too fast and the $_SESSION variables may get erased when the upload has finalized. Well there is some truth in this also:

If we really wanted to test progress tracking in a LAN we may need to reduce the bandwidth of your network connection. Information about how one can achieve this can be found in the article
Geschwindigkeit/Übertragungsrate eines Netzwerkinterfaces unter Linux reduzieren

And yes – depending on the parameter setting in the “php.ini”-file the progress related elements of the $_SESSION array may get eliminated:

; Cleanup the progress information as soon as all POST data has been read
; (i.e. upload completed).
; Default Value: On
; Development Value: On
; Production Value: On
; http://php.net/session.upload-progress.cleanup
session.upload_progress.cleanup = On

But even if you reduced your bandwidth considerably in comparison to your transferred file you would nevertheless only get “-1” ! And even if you set the cleanup parameter to “Off” you would only get a trivial answer: 100.

The real reason for our failure is simply that the code execution of our PHP target program only starts when all data sent via the POST mechanism are completely received – this includes the file data!

This is very logical! The core purpose of the PHP job triggered by our file upload form in the last article is to deal with the uploaded file. It cannot be used for progress tracking by principle! We need a different and independent mechanism – in our case independent polling jobs.

So, when the PHP code listed above is executed on the server the file is already fully uploaded and the $_SESSION elements of $_SESSION which contained the progress information may no longer exist if the cleanup parameter is ON in php.ini settings!

The fact that the PHP job started by the upload form is of no use for progress tracking has two important implications:

First: By what should or could the PHP session for progress tracking be started at all in our Ajax controlled job environment? One assumption could be that the PHP engine does it by itself as soon as it somehow recognizes upload circumstances. However, this is not the case – and uncontrolled automatic session starts would actually introduce security risks into PHP. In fact we have reached an important point :

The PHP session must have been started already before our Ajax controlled file transmission from the client starts!

Otherwise the whole progress tracking process will not work! It uses a PHP session that must have been opened before!

How could we do this? Now, I may remind the kind reader about a note in the last article: Our HTML page with the upload form was created by a PHP program – which itself may use method of the classes of a template engine like Smarty. So, the PHP program that creates our initial web page already could open the required PHP session! An alternative would be that our client starts a precursor Ajax job with the sole purpose of opening a PHP session. Not very efficient, but possible.

And another aspect has become very clear now that would also be valid for independent polling jobs:

If we want to access the progress information in the $_SESSION array about the ongoing upload process we MUST have the $_
POST information about the upload process identifier available as the first information reaching the server – i.e. before the file data themselves start running in! This is the very reason why we had to take care about the order by which data are sent from the client to the server – see the related remarks in the previous article!

Note in addition that we did not destroy the session or unset its cookie at the end of our target program program function for Phase I. The reason is that we may use the session also in later phases.

Enough for today. Please, note also that the only substantial and effective things we have discussed so far on the PHP side were:

  • to access a hopefully already existing (!) PHP session,
  • to initiate the PHP output buffer

However, we have learned something about the key we use for accessing progress information in the $_SESSION array and understood its relation to a $_POST parameter, which must be provided by the client.

We shall come back to the treatment of the fully uploaded file by the methods listed above in a later article.
In the next article

CSV file upload with ZIP containers, jQuery, Ajax and PHP 5.4 progress tracking – IV

we shall instead look at the required sequence of Ajax controlled “polling” jobs started by the client to retrieve information about the upload progress status from the server. Independently of our main PHP job ….