CSV file upload with ZIP containers, jQuery, Ajax and PHP 5.4 progress tracking – II

In the first article of this series I summarized which phases and steps we may have to cover during an Ajax driven file upload procedure for a ZIP container file. See:

CSV file upload with ZIP containers, jQuery, Ajax and PHP 5.4 progress tracking – I

In this second article, I shall describe in more detail what has to be done during the first step identified for Phase I – the file transfer phase, where a ZIP container file is to be sent over the Internet from the client to the server. We look at the client side and describe some special settings and “tricks” which are required to enable a working and parallel transfer of the ZIP container file and other parameter data to the server via jQuery’s Ajax interface.

Note that the information given below should be regarded as a collection of suggestions. The code snippets are enhanced to make them understandable. Everything can of course be programmed in a different and more efficient way.

We start with some minor preparations to take care of:

Naming conventions for the CSV-files of the Zip container and a related database table

The server must get a chance to distinguish between the required actions for the handling of each of the CSV files in the transferred Zip container. We do this by setting up a small table “fileToTbl” in the LAMP server’s database. We use this table to associate a file name with a target table of the database. A file “alpha.csv” may thus be associated with a DB table named “alpha”.

If there are several files referring to a split sequence of data for one and the same target database table the CSV file names may get an integer suffix like “alpha_1.csv, alpha_2.csv, …” – which indicates a loading sequence. The server can extract relevant parts of the file names by the PHP function “explode()” during the treatment of the transferred files.
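The name splitting can be sketched as follows. On the server the article uses PHP's explode(); the sketch below uses JavaScript's string functions instead, only to stay in the language of the client-side examples. The function name parseCsvFileName and the returned property names are assumptions made for illustration.

```javascript
// Hypothetical helper: split a CSV file name like "alpha_2.csv" into the
// target table name and an optional loading sequence number.
function parseCsvFileName(fileName) {
  var parts = fileName.split(".");                 // e.g. ["alpha_2", "csv"]
  if (parts.length !== 2 || parts[1].toLowerCase() !== "csv") {
    return null;                                   // violates the naming convention
  }
  var base = parts[0];
  var pos = base.lastIndexOf("_");
  var suffix = pos > 0 ? base.slice(pos + 1) : "";
  if (/^[0-9]+$/.test(suffix)) {
    // Integer suffix found: the file is part of a split sequence.
    return { table: base.slice(0, pos), seq: parseInt(suffix, 10) };
  }
  return { table: base, seq: null };               // single file, no sequence
}
```

A name that does not end in ".csv" yields null, which the server could translate into the warning mentioned further below.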

The table “fileToTbl” may furthermore contain a column “onr” which via an integer defines the standard order by which all possibly received files should be imported into the database. Thus, typical columns would be:

nr, file_name, db_table_name, onr

It is the user’s responsibility to fill only CSV files with the defined names into the ZIP-container to be uploaded. If during ZIP file extraction files are detected which do not fulfill the naming conventions the server should issue a warning in its Ajax response.

Note that although the table “fileToTbl” includes information about all possible files and their names, the number of CSV files sent in a specific upload process may be much lower than the maximum possible number. The user may want to update only some input data tables.

Control array for a database import pipeline

The previously described measure enables the server to build up an ordered “pipeline” for loading the received CSV files sequentially into their target tables in Phases II, III, …. When the server-side target PHP program of Phase I extracts the files from the transferred ZIP-container, it can analyze the file names and build up an ordered sequence for the import according to the information found in the table “fileToTbl”.
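A minimal sketch of how such an ordered sequence could be derived is given below. It is written in JavaScript for consistency with the client-side examples, whereas the real work would happen in the PHP program on the server. The sample rows, the function name buildPipeline and the assumption that “file_name” holds the base name without the “.csv” extension are illustrative only.

```javascript
// Hypothetical rows of "fileToTbl" (columns: nr, file_name, db_table_name, onr).
var fileToTbl = [
  { nr: 1, file_name: "alpha", db_table_name: "alpha", onr: 2 },
  { nr: 2, file_name: "beta",  db_table_name: "beta",  onr: 1 }
];

// Order the received files by "onr" and, within one table, by the integer
// suffix of the file name. Names without a matching record are collected
// separately so that a warning can be issued.
function buildPipeline(receivedFiles, tblRows) {
  var pipeline = [], unknown = [];
  receivedFiles.forEach(function (fname) {
    // Strip ".csv" and an optional "_<n>" sequence suffix.
    var m = /^(.+?)(?:_(\d+))?\.csv$/i.exec(fname);
    var row = m && tblRows.filter(function (r) { return r.file_name === m[1]; })[0];
    if (!row) { unknown.push(fname); return; }
    pipeline.push({
      fname: fname, ftbl: row.db_table_name, ftbl_nr: row.nr,
      onr: row.onr, seq: m[2] ? parseInt(m[2], 10) : 0
    });
  });
  pipeline.sort(function (a, b) { return (a.onr - b.onr) || (a.seq - b.seq); });
  return { pipeline: pipeline, unknown: unknown };
}
```

Only the files actually found in the ZIP container enter the pipeline; records of “fileToTbl” without a received file are simply ignored.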

The information about the loading sequence can be encoded into an array. We call this array “DB Import Control array”, or DBIC-array for short. It can and will be updated at the end of each Ajax phase and will also be exchanged with the client. The (associative) DBIC array should contain information about which files have already been loaded, which files are still to be loaded, and whether an error occurred. Therefore, elements of the array could be:

dbic[i]['fname'] = 'Name_of_the_i-th_file';
dbic[i]['fsize'] = 'Size_of_the_i-th_file';
dbic[i]['ftbl'] = 'Name_of_the_target_DB_table';
dbic[i]['ftbl_nr'] = 'Number_of_the_relevant_record_in_the_fileToTbl_table';
dbic[i]['pipe'] = 1; // 1: file is part of an import pipeline, 0: only one file, no pipe
dbic[i]['loaded'] = 0; // 0: not yet loaded, 1: already loaded
dbic[i]['err'] = 'error_text_or_error_code';

“i” is an index counting the files according to the loading order defined in “fileToTbl”.

The array will be maintained in PHP’s $_SESSION and shall in addition be part of a JSON object transferred to the client as the Ajax answer of each client/server interaction phase until all files are processed. See again the previous article for the defined phases and steps.
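On the client, the DBIC array arrives as part of the parsed JSON response. The following sketch shows how a CtrlO method might pick the next file to import from it. The sample data, the function name nextFileIndex and the decision to stop at the first recorded error are assumptions for illustration, not part of the project code.

```javascript
// Hypothetical DBIC array as it might look after Phase I.
var dbic = [
  { fname: "beta.csv",    fsize: 2048, ftbl: "beta",  ftbl_nr: 2, pipe: 0, loaded: 1, err: "" },
  { fname: "alpha_1.csv", fsize: 4096, ftbl: "alpha", ftbl_nr: 1, pipe: 1, loaded: 0, err: "" },
  { fname: "alpha_2.csv", fsize: 4096, ftbl: "alpha", ftbl_nr: 1, pipe: 1, loaded: 0, err: "" }
];

// Simulate the JSON round trip between server and client.
var received = JSON.parse(JSON.stringify({ dbic: dbic })).dbic;

// Return the index of the next file to import, or -1 if nothing is left
// or if a file reported an error (assumed policy: stop at the first error).
function nextFileIndex(dbicArray) {
  for (var i = 0; i < dbicArray.length; i++) {
    if (dbicArray[i].err) { return -1; }
    if (!dbicArray[i].loaded) { return i; }
  }
  return -1;
}
```

Because the array is ordered according to “fileToTbl”, the first entry that is not yet loaded is the natural candidate for the next import phase.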

Elements of a simple upload form

Let us now turn to the client and the key preparations there for initializing Phase I. We first build a very simple HTML 4 compatible form to trigger the upload job. We shall extend this web page area later on by progress bars. In the beginning our web page area for the upload shall look as simple as this:

[Image upl_1: the simple upload area of the web page]

In our project this Web UI area and its contained FORM tag are part of a PHP template (ITX or Smarty TPL) whose placeholder variables are filled by a PHP web page generator program. The TPL handling is assumed to be performed by a specific Template Control Object [TCO] which uses functionality of the chosen template engine. In our examples below we refer to the ITX case where placeholders of the form “{PLACE_HOLDER_NAME}” are to be filled. The template object shall be identified in our PHP code examples by a variable “tpl” of the TCO as $this->tpl.

In the following HTML code of our template we have used self-explanatory IDs. We leave the simple CSS formatting to the reader. Concentrate instead on the FORM tag defined:

<!-- BEGIN UPLOAD -->											
<div id="div_upload_cont">
	<div id="div_upload">
		<div id="upl_float_cont">
			<div id="div_upl_header">
				<p id="upl_header"><span></span></p>
			</div>
			<div id="csv_file">
				<p><span class="fsnorm">CSV-files: </span><span id="num_open_files" class="bred">0</span></p>
			</div>
			<div id="imp_file">
				<p><span class="fsnorm">Imported-files: </span><span id="num_extracted_files" class="bgreen">0</span></p>
			</div>
			<p class="floatstopboth"> </p>
		</div>
		<form id="form_upload" name="init_file_form"  action="handle_uploaded_init_files.php5" method="POST" enctype="multipart/form-data" >

			<input type="hidden" lbl="progress" name="{SESS_UPL_PROGR_FIELD_NAME}" id="hinp_progress_key_name" value="upl">
			
			<div id="file_cont">
				<input type="file" name="init_file" id="inp_upl_file" >
				<a id="but_submit_upl"  class="basic_but" href="#">Start Upload</a>
				<p class="float_stop"> </p>
			</div>
			
			<input type="hidden" name="upl_tbl_num" id="hinp_upl_tbl_num">
			<input type="hidden" name="upl_tbl_name" id="hinp_upl_tbl_name">
			<input type="hidden" name="upl_tbl_snr" id="hinp_upl_tbl_snr">
			<input type="hidden" name="upl_file_succ" id="hinp_upl_succ" value="0">
			<input type="hidden" name="upl_file_name" id="hinp_upl_file_name" value="0">
			<input type="hidden" name="upl_file_pipe" id="hinp_upl_file_pipe" value="0">
			<input type="hidden" name="rt" id="hinp_upl_run_type">
		</form>
	</div>
</div>												
<!-- END UPLOAD -->

 
The “init” in some names is unimportant and project specific. In our present context you may assume it indicates Phase I.

The whole area of the enclosing DIV (id=”div_upload_cont”) is controlled on the JS side by an associated Control object “CtrlO_FileUpl” derived from a respective class (see below). The button defined above is used to trigger the file transfer process. This is done by a specific CtrlO method. Therefore, neither a form submit button is used, nor a special “href”-definition is required.

Note the variety of hidden input fields defined inside the FORM tag. Most of these fields will be used in several subsequent Ajax phases until the complete upload process is finalized.

However, during the starting Phase I only the file input field and the first hidden input field are relevant. The first and somewhat special input field can be identified in the code above by a private attribute: lbl=”progress”. (This attribute is otherwise unimportant). The “name” attribute of this input field is determined by a template placeholder which is replaced by the PHP TCO during web page creation.

	<input type="hidden" lbl="progress" name="{SESS_UPL_PROGR_FIELD_NAME}" id="hinp_progress_key_name" value="upl">

 
Note: The position of the first special input was chosen on purpose. A later section of this article explains why the order of the data fields may become important during the POST transfer.

I want to add a second note here, whose importance we shall understand later in this article series. As we have assumed, the web page containing the upload form will be created by a PHP program with the help of a TCO. Now:

The PHP program creating the web page with the template based upload form must initiate a PHP session !

What is the (first) special input field used for?

Our first hidden input field must exist to trigger the initialization of the provision of progress data for the file transfer on the server. The server will look for the name and value of this input variable. The name has to follow rules so that the server recognizes it as special. The value will be used to define a key for accessing progress information in the $_SESSION array.

Actually, the name of the input field has to be identical to the value of the following PHP ini-variable:

; The index name (concatenated with the prefix) in $_SESSION
; containing the upload progress information
; Default Value: "PHP_SESSION_UPLOAD_PROGRESS"
; Development Value: "PHP_SESSION_UPLOAD_PROGRESS"
; Production Value: "PHP_SESSION_UPLOAD_PROGRESS"
; http://php.net/session.upload-progress.name
session.upload_progress.name = "PHP_SESSION_UPLOAD_PROGRESS"

This variable is defined in the php.ini file on the Apache server (normally it is located at “/etc/php5/apache2/php.ini”). It is essential that the server receives a field with this name in the POST data of the upload request; otherwise the PHP engine will not provide any upload progress information.

In PHP code we can retrieve the value of the required name (from the variable settings in the PHP ini-file) by using the PHP function ini_get(). A PHP code snippet of the page generator TCO for filling our ITX-Block would look like:

 // Treatment of Upload Area 
$upload_block = "UPLOAD"; 
$tpl_hinp_upl_progr_field_name = "SESS_UPL_PROGR_FIELD_NAME"; 
$val_hinp_upl_progr_field_name = ini_get("session.upload_progress.name");
if ( $this->show_upload == 1 ) {
	$this->tpl->setCurrentBlock($upload_block);
		$this->tpl->setVariable($tpl_hinp_upl_progr_field_name, $val_hinp_upl_progr_field_name);
	$this->tpl->parseCurrentBlock();
}		 		

 
leading to

<input type="hidden" lbl="progress" name="PHP_SESSION_UPLOAD_PROGRESS" id="hinp_progress_key_name" value="upl">

 
inside the created web page. The structure of the code snippet would be very similar in case of the SMARTY engine.

Note: The defined value “upl” of the now named input field will later on be used to compose the key which we need to identify the array element containing progress information in the $_SESSION array.

The Javascript CtrlO for the Upload Area

As we have said, we control all event handling and all Ajax actions (including, of course, those of the initial Phase I) by a defined CtrlO on the JS side. Such a singleton object responsible for our upload form may be derived from the following “class” definition:

GOC.CtrlO_FileUpl = new Ctrl_File_Upl('CtrlO_FileUpl');   

function Ctrl_File_Upl(my_name) {
	
	this.obj_name = "Obj_" + my_name;
	this.GOC = GOC; 
....
	// Timeout for file transfer process
	this.timeout = 100000; 

....
	// define selectors of the div and form 
		this.div_upload_cont_sel 	= "#" + "div_upload_cont";
		this.div_upload_sel 		= "#" + "div_upload";
		this.p_header_upload_sel 	= "#" + "upl_header" + " > span";
		this.form_upload_sel 		= "#" + "form_upload";
		this.input_file_sel 		= "#" + "inp_upl_file";
		this.upl_submit_but 		= "#" + "but_submit_upl";
		
		this.hinp_upl_tbl_num_sel	= "#" + "hinp_upl_tbl_num";			
		this.hinp_upl_tbl_name_sel	= "#" + "hinp_upl_tbl_name";			
		this.hinp_upl_tbl_snr_sel	= "#" + "hinp_upl_tbl_snr";			
		this.hinp_upl_succ_sel 		= "#" + "hinp_upl_succ";			
		this.hinp_upl_run_type_sel 	= "#" + "hinp_upl_run_type";			
		this.hinp_upl_file_name_sel 	= "#" + "hinp_upl_file_name";			
		this.hinp_upl_file_pipe_sel 	= "#" + "hinp_upl_file_pipe";			
		
		this.num_open_files_sel		= '#' + "num_open_files";
		this.num_extracted_files_sel	= '#' + "num_extracted_files";
...
	// Determine URL for the Form 
		this.url = $(this.form_upload_sel).attr('action'); 
		console.log("Form_Upload_file - url = " + this.url);  				

	// Register events with jQuery 
		this.register_form_events(); 
....
}

// Method to register events   
Ctrl_File_Upl.prototype.register_form_events = function() {
	// this indirectly also calls the secondly defined proxy method below 
	$(this.upl_submit_but).click(
		$.proxy(this, 'submit_form') 
	);
	// The real method called 	
	$(this.form_upload_sel).submit( 
		$.proxy(this, 'upl_file') 
	); 
}; 

 
We shall look at other methods of such an object later on. Let us first look a bit closer at some of the above definitions.

Side aspects: The GOC object indicated in the first line is a special object which controls singleton objects in our JS code. I call such an object “Global Object Controller”. Thus we avoid placing the variety of our specific CtrlO objects into the global JS space. Note that such a GOC can also be used for dispatching knowledge about all created singleton CtrlO objects to all objects (e.g. each of the CtrlOs) which may need to know about their existence. Although it may be interesting in itself, we do not look at the GOC in detail in this article series.

The variables defined in the beginning of our class definition will be used as jQuery selectors for several objects of our upload container DIV in the CtrlO methods defined below.

The URL of the target PHP program to be addressed by the Ajax request of Phase I is in our example read from the related attribute of the HTML form tag. See the HTML code above for it.

Important: Note the use of jQuery’s $.proxy-mechanism to encapsulate event control in methods of the CtrlO. The definitions given associate a specific event occurring at one of the HTML elements with a CtrlO method.

Read more about using $.proxy in the jQuery documentation: https://api.jquery.com/jQuery.proxy/. The “trick” here is to define the proper context for the JS “this”-operator used later in the triggered object’s methods; the context has to be switched explicitly from the HTML element affected by the user event to the CtrlO object. $.proxy does this for us in a simple, elegant way. You may read more about this trick in another (German) article series of this blog:
Fallen beim Statuscheck lang laufender PHP-Jobs mit Ajax – III (“Pitfalls in the status check of long-running PHP jobs with Ajax – III”)
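The context problem can be reproduced without jQuery. The sketch below uses the native Function.prototype.bind, which for this purpose plays the same role as $.proxy(this, 'method_name'). The names Ctrl_Demo, who, fire and fakeElement are hypothetical and only serve the demonstration.

```javascript
// Hypothetical miniature control object.
function Ctrl_Demo(name) { this.obj_name = name; }
Ctrl_Demo.prototype.who = function () {
  // "this" depends on how the function is called, not on where it is defined.
  return this && this.obj_name;
};

var ctrl = new Ctrl_Demo("CtrlO_Demo");
var fakeElement = { id: "but_submit_upl" };    // stands in for the clicked HTML element

// An event system calls the handler with the affected element as "this".
function fire(handler, element) { return handler.call(element); }

var unbound = fire(ctrl.who, fakeElement);            // "this" is the element
var bound   = fire(ctrl.who.bind(ctrl), fakeElement); // "this" is forced to ctrl
```

In the unbound case the method looks for “obj_name” on the element and finds nothing; in the bound case it sees the control object, which is what our CtrlO methods rely on.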

We shall look at the central method “upl_file” for starting the upload in a minute.

First obstacle: A method to transfer files and POST data at the same time via jQuery’s Ajax interface

Unfortunately, the transfer of file data from a standard form is not as simple via jQuery’s Ajax interface as some readers may expect. E.g. you run into trouble if you want to transfer normal input/textarea data from a form together with file data to a (PHP) server at the same time via the POST mechanism. This is due to the fact that the standard settings of jQuery’s Ajax interface do not cover what is required for both file transfer and standard data transfer. By default, data from standard input elements are processed into a query string fitting the default content-type “application/x-www-form-urlencoded”. However, the corresponding $.ajax settings do not work with file uploads. This is described in the following articles:
http://stackoverflow.com/questions/5392344/sending-multipart-formdata-with-jquery-ajax
http://abandon.ie/notebook/simple-file-uploads-using-jquery-ajax

What is required to overcome this problem? As described in the named articles a suitable step is to define an internal “FormData” object and attach the information gathered in the relevant input fields of our HTML FORM to this object. The data of the internal “FormData” object are then used in the Ajax controlled transfer with some special parametrization of the Ajax environment (more precisely: of the XMLHttpRequest object).

In discussions with JS developers who are used to conventional JS coding, most found this approach more confusing than helpful. Personally, I find it elegant and fully in line with my general attitude of using internal objects and their methods to control all aspects of user interaction, events and Ajax communication.

The resulting method “upl_file” to start the upload via Ajax looks as follows – note especially the creation of the FormData object :

Ctrl_File_Upl.prototype.submit_form = function (e) {
	e.preventDefault();
        $(this.form_upload_sel).submit(); 
};

Ctrl_File_Upl.prototype.upl_file = function(e) { 
	// Prevent Default action 
	e.preventDefault();
	
	// Set cursor to wait 
	$('body').css('cursor', 'wait' ); 

	// Some variables
	var form_data, url;
				
	// The identification key for the uploaded file in the server's $_FILES
	var file_id_key = 'init_file';
				
	// Reset the values for the numbers of uploaded/open files 
	this.num_extracted_files = 0; 
	this.num_open_files = 0; 
	$(this.num_extracted_files_sel).html(this.num_extracted_files); 
	$(this.num_open_files_sel).html(this.num_open_files); 
....			
....			
	// Create a FormData object   
	form_data = new FormData();

	// Firstly (!!), add the hidden data to the FormData object  
	var params = $(this.form_upload_sel).serializeArray();
	$.each(params, function (i, val) {
		form_data.append(val.name, val.value);
	});	

	// Secondly, fill the form with the information of the chosen file 
	// for the File API supported in present FF 
	$.each($(this.input_file_sel)[0].files, function(i, file) {
		if (i == 0) {
			form_data.append(file_id_key, file);
		}
	});
		/*	 
		// Multiple file selections in HTML 5 
		$.each($(this.input_file_sel)[0].files, function(key,value) {
			form_data.append(key, value); 
		});
		*/

	// "file" will be set and analyzed as a GET parameter to the PHP target url  
	url = this.url + "?file";
				
	// Time measures 
	this.date_start = new Date(); 
	this.ajax_transfer_start = this.date_start.getTime(); 
	console.log("From Ctrl_File_Upl.upl_file() ::  ajax_start = " + this.ajax_transfer_start);  

	// Setup Ajax 
	$.ajax({
		// contentType: "application/x-www-form-urlencoded; charset=ISO-8859-1",
		url: url, 
		context:  GOC[this.obj_name],
		timeout: 100000,
		data: form_data, 
		type: 'POST', 
		cache: false, 
		dataType: 'json', 
		contentType: false,
		processData: false, 
	
		error: this.error_ajax_file_upl,
		success: this.success_ajax_file_upl
	});
};	

 
Let us discuss some aspects of this method in detail:

  • e.preventDefault() prevents the browser’s default reaction to the event (following the link’s “href”, or a conventional form submission with a page reload), so that only the defined CtrlO method code handles the upload. Note that it does not stop the event from propagating through the HTML element hierarchy; that would require e.stopPropagation().
  • We define a key name “init_file” for our file to be able to identify it precisely later on in the PHP superglobal array $_FILES. This is not only done for convenience reasons: Although we only upload exactly one file in our present example based on HTML 4.01, we should be prepared to extend our methods to possible multi-file selection options of the modern HTML 5 file upload API.
  • At the beginning of Phase I no files of the ZIP container have been processed yet. Therefore, we set the numbers in the respective fields of our form to zero.
  • We create the required internal “FormData”-object.
  • Important:
    We append the special input field defining the key for upload progress information in the $_SESSION array first – i.e. before we append any file data.
     
    Please, do not ignore this point! It took me hours in the beginning to find out that a different order in the data transfer really leads to a complete failure of the whole concept of providing progress data in the $_SESSION superglobal ! Much later and by chance I found a related hint in one of the comments of PHP’s documentation http://php.net/manual/de/session.upload-progress.php

    The point is that the target PHP program of the (Ajax controlled) transfer process receives all data via the POST mechanism – but the PHP 5.4 engine has to recognize already at the very beginning of the transfer that a data upload whose progress shall be followed is initiated. The special POST field therefore has to arrive before the file data appear in the POST buffer. I had never thought about whether there is an ordered sequence of information transfer during the POST process – but there is! Data for the first fields of a form are transferred first! So, to be consistent, always keep the special input field at the top of all other fields providing data in your HTML FORM tag. In our case the POST data stream is actually derived from the elements of the internal FormData object – but there the same ordering rules are valid. Therefore, we append the data of this input field first to the FormData object.

  • We retrieve the information of the HTML FORM by jQuery’s serializeArray() functionality. The trick with

    params = $(this.form_upload_sel).serializeArray();

    is that the method .serializeArray() of the jQuery object does not serialize input data for files or buttons of the HTML FORM tag. So, we only add the values of the hidden input fields – and among these our special parameter.

  • We now read the information about the file(s) selected in our HTML FORM – we already do this in the form of a loop over all possible files of an HTML 5 multiselection field – although we do not really use multiple file selection in our example case. The selector variant in our code is valid for present Firefox browsers, which support a modern HTML5 file API (also in HTML 4.01 code). See e.g.:
    https://developer.mozilla.org/en-US/docs/Using_files_from_web_applications
    http://www.sanwebe.com/2013/10/check-input-file-size-before-submit-file-api-jquery
    http://stackoverflow.com/questions/5392344/sending-multipart-formdata-with-jquery-ajax

    Annotation: The [0] after the jQuery selector is required: it extracts the underlying DOM element from the jQuery object, and only the DOM element provides the “files” property. For MS IE you probably need version 10 or 11. I have not tested this.

  • We supply a GET-parameter “file” to our url to explicitly distinguish Phase I from later phases – where we shall define another parameter.
  • Eventually we set up our Ajax environment and trigger the Ajax communication via jQuery’s $.ajax() method.
     
    Important: Note again that we explicitly set the context for the this-operator of the Ajax interaction environment to our present CtrlO, which itself is an element of the mentioned GOC. Only by using this trick can we be sure that the “this”-operator in the method used to handle the Ajax response later will refer to our present CtrlO. This is really important; otherwise “this” would by default refer to an object representing the settings of the Ajax call.
  • Note also another important setting: processData: false
    This prevents jQuery’s Ajax interface from converting the data (here our FormData object with the file data) into a query string – thus making it possible to transfer the file data correctly via jQuery’s Ajax interface functions. By setting “contentType: false” we in addition tell jQuery not to set a Content-Type header itself, so that the browser can supply the “multipart/form-data” type with the proper boundary.
  • Last but not least we define methods of our present CtrlO object “GOC.CtrlO_FileUpl” to be responsible for dealing with errors or the Ajax response object in case of a successful communication cycle of our Phase I. We shall look at these methods in a later article.
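The ordering rule discussed in the list above can be made explicit instead of relying on the position of the field in the HTML form. The following is a minimal sketch under that assumption; the helper name appendInUploadOrder and its parameters are hypothetical. “sink” is any object with an append(name, value) method, e.g. a browser FormData object, and “fields” has the shape returned by jQuery's serializeArray().

```javascript
// Hypothetical helper: append the progress field first, then all other
// fields, and the file data last.
function appendInUploadOrder(sink, fields, progressName, fileKey, file) {
  var first = fields.filter(function (f) { return f.name === progressName; });
  var rest  = fields.filter(function (f) { return f.name !== progressName; });
  first.concat(rest).forEach(function (f) { sink.append(f.name, f.value); });
  if (file) { sink.append(fileKey, file); }      // file data always comes last
  return sink;
}

// Possible use in the browser (not executed here):
//   var fd = appendInUploadOrder(new FormData(),
//              $("#form_upload").serializeArray(),
//              "PHP_SESSION_UPLOAD_PROGRESS", "init_file",
//              $("#inp_upl_file")[0].files[0]);
//   $.ajax({ url: url, type: "POST", data: fd,
//            processData: false, contentType: false });
```

With such a helper the upload would keep working even if somebody later moved the hidden progress field further down in the template.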

Note that according to our present setup the target PHP program addressed by the “url” will have to care about input data arriving in the following three superglobals:

$_GET, $_POST and of course $_FILES

Enough for today. We shall see what happens on the server side in the next article of this series to come. At least, we have set up everything such that the server can recognize at the beginning of the data transfer that the progress of the file data transmission shall be tracked.

Please, be a bit patient. For the next 3 weeks I am involved in a different project. But the article

CSV file upload with Zip containers, jQuery, Ajax and PHP 5.4 status tracking – III

will be written.

CSV file upload with ZIP containers, jQuery, Ajax and PHP 5.4 progress tracking – I

This article series is written in support of French colleagues in a PHP collaboration project and therefore in English. I want to describe some basic elements of an Ajax controlled file upload process between

  • a browser based User Interface (HTML4/5, Javascript, jQuery)
  • and some PHP/MySQL application programs on a LAMP server.

Our customer’s project depends on a periodic transfer of up to 40 different CSV files with a lot of input data (around 0.5 GByte) to a database server. A requirement of our customer was that the data transfer should be performed with a ZIP file as a container for the individual CSV data files. After the transfer the contents of the individual CSV files should be imported into specific tables of a database.

As our whole interface of the web application is Ajax based, we decided to control all transfers via jQuery’s Ajax API. Meanwhile, there are jQuery Plugins available for this type of task. However, we wanted to fully control all important phases of the file transfer and the data import – both on the client (browser) as well as on the server. This meant that we needed to program all basic steps during an Ajax communication cycle between the browser client and the LAMP server by ourselves. In addition we needed to guarantee some error control.

Personally, I found it a bit astonishing that such a seemingly simple task led me to some relatively intricate obstacles to overcome. Although most of the necessary ingredients are documented on the Internet, the documentation is sparse and distributed. My objective with this article series is to provide a coherent picture of process design aspects, some coding tricks and also limitations of such a process. This may be useful also for other developers having to solve similar problems. However, if you want to read about a most simple and problem free approach to file upload tasks with Ajax you are probably looking at the wrong article.

Objectives and Assumptions

  • We want to upload several CSV-files (up to 40), whose contents shall be transferred to specific database tables.
  • These files shall be sent to the server in one ZIP file. Reasons for using a ZIP-container file: compression; limitations of the HTML4 file upload API.
  • As the Zip-container may get relatively large we want to see and control the transfer progress over the Internet by some means of PHP 5.4 – as far as possible today.
  • The server shall extract the files from the ZIP and build up a “pipeline” of these files for a subsequent database import of their contents.
  • The data import into database tables shall be done by a sequence of Ajax controlled PHP jobs. Reason: Intermediate information transfer to the client with the option to stop further processing.
  • The server shall decide by some naming conventions what to do with each file.
  • All steps shall be Ajax controlled – a relatively continuous flow of information between client and server has to be established.
  • For the sake of simplicity each Ajax answer of the server at the end of each controlled Ajax transaction cycle shall be encoded in form of a JSON object. (So, if you want to be particularly precise: we use Ajaj instead of Ajax.)

Wording used in this article series

  • JS, jQuery, Ajax: JS below stands for Javascript (on the client side). We furthermore use jQuery and its Ajax interface functionality. We expect JSON responses from the server. Although not completely correct we nevertheless use Ajax and Ajaj as synonyms in the articles of this series.
  • Upload: By “upload” we normally mean the whole process. It comprises a “file transfer process” from the Web client PC to the server and subsequent “database import processes”. However, sometimes and for reasons of simplicity we also use the expression “file upload” in a restricted sense – namely for the file transfer to the server, only. It should become clear from the context what we mean.
  • Main PHP program/job: The PHP program receiving and working with the transferred Zip-container file and its contents is called “main PHP program” or “main PHP job”. It has to be distinguished from “polling jobs” (see below).
  • Polling jobs: A sequence of additional PHP “polling jobs” may be triggered by the client. This is done in form of a time loop with a short period. A “polling job” on the LAMP server reads some status information of a previously started and still running “Main PHP job” (as e.g. the file transfer job or long lasting database import jobs). The status information of the running main job is fetched from a common data source as the $_SESSION or a database table accessible to both the status writing “main PHP job” and the status reading “polling job”. Each short timed “polling job” fulfills its own complete Ajax transaction cycle. The evaluation of the Ajax response triggers the next polling job if the main PHP job is still running. We come back to the concept of “Ajax driven polling jobs” later on.
  • CtrlOs: Each user interaction area of a web page – e.g. a HTML FORM in a DIV container – shall be completely controlled by a so called JS “Control object” [CtrlO]. A CtrlO encapsulates all reactions of the UI to events in well defined prototype methods. A CtrlO uses jQuery’s proxy mechanism to register events and delegate event handling to defined CtrlO methods. CtrlO methods furthermore control the Ajax communication with the server.
  • Phases: “Phases” describe a full cycle of defined Ajax interaction between client and server. An example of such a full cycle would be:

    HTML Form => Ajax Setup via JS CtrlO method => Submit via JS CtrlO method => POST/FILE data transfer => Server action (PHP) => JSON object as Ajax response => Client analysis of the JSON object via CtrlO method for the Ajax response

  • Client: The client is in our case typically a browser (Firefox) with active JS and jQuery. We do not care about specific requirements of MS IE browsers in this article series; but we assume that at least MS IE browsers > 10 should work.
  • Pipeline: The ordered sequence by which the files of the transferred ZIP container are imported into their related database tables.
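The idea of “polling jobs” described in the list above can be sketched as a small loop in which the evaluation of each response decides whether another polling job is started. All names are hypothetical: requestStatus(callback) stands in for an Ajax call to a polling job on the server, and schedule(fn) stands in for a timer, e.g. function (fn) { setTimeout(fn, 500); }. The property name “running” in the status object is likewise an assumption.

```javascript
// Hypothetical polling loop: each polling job is a complete request/response
// cycle; the next one is only triggered while the main job is still running.
function startPolling(requestStatus, onStatus, schedule) {
  function poll() {
    requestStatus(function (status) {
      onStatus(status);              // e.g. update a progress bar
      if (status.running) {
        schedule(poll);              // main job still active: poll again
      }
    });
  }
  poll();
}
```

Passing the timer in as a parameter keeps the loop independent of the browser and makes it easy to stop: the loop ends as soon as a response reports that the main job is no longer running.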

Relevant phases

To get a more detailed overview of what is to be done we distinguish the following main phases and steps (in this overview I omit error handling, which may be required at every step):

Phase I – file transfer, progress control and Zip extraction

Step I.1 – Client: Use a HTML form to choose a ZIP file (<input type="file">) and use methods of a specifically designed JS Control object [CtrlO] to control subsequent actions on the client. Add parameter data (hidden input fields) and prepare an Ajax transaction for the file upload (=transfer) process.

Step I.2 – Client: Start the transfer of the ZIP file over the Internet to the server. Submit a special parameter in addition to the file to trigger the provision of transfer progress information on the server. Prepare and start the Ajax communication and the data transfer by a CtrlO method.

Step I.3 – Server: Initialize the progress measurement and provide progress data in the $_SESSION array.

Step I.4 – Client/Server: Initiate a sequence of Ajax polling jobs via a JS time loop for reading the progress information on the server. Handle the Ajax response of each polling job in separately defined methods of a special CtrlO. React to error situations and stop the polling job time loop in case of errors or when the file transfer has finished.

Step I.5 – Server: Extract, expand and save the CSV files from the ZIP container into a special upload folder on the server. This is done by using standard methods of the PHP ZIP class. Define/suggest a sequence for importing the contents of the different files into file-specific database tables. This sequence may be controlled via an array (“DB Import Control array” = DBIC-array) which is kept and updated in the $_SESSION array on the server AND which is also sent back to the client via a JSON object.

Step I.6 – Server: Prepare and send an Ajax response in the form of a JSON object to the client with confirmation messages about which CSV files have been received, the names of the files and the order in which they shall be processed. Include error messages and system messages if necessary. The JSON object shall contain the “DBIC”-array.

Step I.7 – Client: Analyze the Ajax response. Display success and error information. Display the number and name of files to be processed afterwards. Stop the time loop for polling jobs.

Phase II – Database import of a file in the pipeline

Step II.1 – Client: Prepare and start a new Ajax job with some parameters. The PHP target program of this job shall import the data of one of the already transferred CSV files. Among other things, the parameters must define which file shall be processed (= imported into its associated database table) next. This choice can follow the suggested order of the array which came from the server at the end of Phase I. All parameters can be set up in a separate (hidden) form with hidden input fields. Submit the Ajax job.
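
How the client might pick the next file from the array received at the end of Phase I can be sketched as follows. The article does not fix the structure of the DBIC-array, so the entry shape used here (`file`, `onr`, `status`) and the function name `nextFileToImport` are assumptions made purely for illustration.

```javascript
// Hypothetical shape of one DBIC entry (an assumption, not taken from the article):
//   { file: "alpha.csv", onr: 1, status: "pending" }
// Returns the name of the unprocessed file with the lowest order number,
// or null if nothing is left to import.
function nextFileToImport(dbic) {
  const pending = dbic
    .filter(function (entry) { return entry.status === "pending"; })
    .sort(function (a, b) { return a.onr - b.onr; });
  return pending.length > 0 ? pending[0].file : null;
}
```

Returning `null` gives the calling CtrlO method a simple signal that the pipeline is finished.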

Step II.2 – Server: Start the database import on the server with a flexible PHP program. For small and medium sized files (up to approx. 500000 lines) do it line by line with the standard PHP functions for handling CSV files. Check the data of each line where reasonable. Gather at least 20 lines in one INSERT statement to accelerate the import process. Write intermediate progress information into the $_SESSION array or a special database table. (This status information may be read by “polling jobs” started on the client.)
For huge files you may later extend the import methods by using the special MySQL command “LOAD DATA INFILE”.
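
The idea of gathering several lines per INSERT statement is independent of the programming language. As a small illustration (written in JavaScript, although the import itself runs in PHP), the rows of a file can be split into batches like this; `chunkRows` is a hypothetical helper name.

```javascript
// Split an array of parsed CSV rows into batches of at most batchSize rows.
// Each batch would then be written with a single multi-row INSERT statement.
function chunkRows(rows, batchSize) {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < rows.length; i += batchSize) {
    batches.push(rows.slice(i, i + batchSize));
  }
  return batches;
}
```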

Step II.3 – Client: Launch a sequence of status information polling jobs via a time loop. Handle the return information of each job in separate methods.

Step II.3 – Server: After a successful import of a defined file, remove the file from its upload directory (delete it or move it somewhere else, e.g. into a history directory for uploads). Update your Control Array for the sequence of uploads with the following info: Which of the original files have been loaded? Which had errors? Which are still unprocessed? Prepare a JSON object for an Ajax response (including the upload Control Array). Send it back to the client.

Step II.4 – Client: Analyze the server’s response. Stop any polling jobs issued after the previous submit. Continue with displaying information of the success of the database import of the handled file. Determine the next file to load. Continue with the elements of Step 5 described above.

Phases III + n – Client/Server – Loop:

Cycle through a sequence of Steps described under II.1 to II.4 for further phases until all files are processed or an error has occurred.

Enough for today. In the next article

CSV file upload with Zip containers, jQuery, Ajax and PHP 5.4 progress tracking – II

we shall cover some major elements of Phase I.

Eclipse Luna, JSDT jQuery – code assist/autocompletion problem – reduced list of jQuery methods?

We are currently developing larger PHP- and jQuery-heavy projects with dynamic, Ajax-based web interfaces for a customer. Our IDE is Eclipse (the Luna release, including PDT, JSDT and the Aptana plugin). Our project refers to older projects and uses links to include directories from those projects in the build path of the current project.

Over the last few weeks a problem annoyed me that I could not solve right away. To work efficiently with JavaScript and jQuery we use JSDT

JavaScript Development Tools	1.6.100.v201410221502	org.eclipse.wst.jsdt.feature.feature.group	Eclipse Web Tools

and JSDT jQuery

JSDT jQuery Integration 1.7.0	org.eclipselabs.jsdt.jquery_feature.feature.group

to get, among other things, jQuery-related code assist and autocompletion hints in the JSDT JavaScript editor during JavaScript development.

For the setup see e.g.:
https://code.google.com/ a/ eclipselabs.org/ p/ jsdt-jquery/ wiki/ Installation
or
http://www.htmlgoodies.com/ html5/ javascript/ add-external-js-libraries-to-eclipse-jsdt-driven-projects.html# fbid=PCl6TfPGIe5
and there the section “Adding a JS Object Model Plugin”.

An important step towards jQuery code assist is to include the desired current version of the jQuery definition library in your project. As described in the articles above, this is done via an Eclipse configuration dialog for the JavaScript libraries of the current project.

In newly created projects with combined “PHP/JSDT natures” or “faceted natures”, code assist in JSDT's own JavaScript editor worked just fine. For example, a complete list of all available methods of the jQuery object was displayed, depending on the loaded version of the jQuery library.

In my actual main project, with its links into areas of other projects, code assist showed only a very strongly reduced list of methods of the “jQuery” object.

This is really annoying during development. In such cases I fell back on the corresponding functionality of the Aptana plugin and its JS editor, although I do not like it (in contrast to the Aptana HTML editor).

In addition, a rebuild of my project after a “Clean” led to aborts with (Java NullPointer) errors of the JS validator executed during the build.

In the meantime I made several attempts to rebuild my project (and also dependent projects) with respect to their natures and facets. In vain. The length of the code assist list did not improve. A comparison of the project settings at the Eclipse level did not help either. Of course I also compared all versions of the loaded JavaScript libraries (among others, the configured jQuery definition library). None of this brought success.

Interestingly, a complete list of jQuery methods appeared when the “ECMA 3 Browser Support Library” was removed from the loaded JavaScript libraries in the corresponding JavaScript configuration dialog of the project. A complete list also appeared in JavaScript code assist when JS support in the Eclipse project was deactivated by removing the JSDT nature of the project: the JavaScript validator then no longer shows up among the active validators of the project and is therefore not used.

From this it followed that my problem had to be related to the JS validator and its checking of existing JS files. Today this finally put me on the right track:

In my older projects there were directories in which, besides my own JS files, I had stored several old versions of the jQuery library and definition files, e.g. jquery-1.4.2.min.js or even older variants.

Unfortunately, the directory links made these directories part of the source and build path of the current project. The old definitions there evidently took priority in affecting the JS validator and also the code assist functionality. Somehow logical, even though I cannot follow how the validator analysis prioritizes among several existing jQuery files. Nevertheless: my mistake! Different libraries that define the jQuery object in different ways can only lead to chaos during builds and validations.

What have I learned?

To work sensibly with “JSDT jQuery”, you should not include an existing collection of old jQuery library files in the source and/or build path of the current project. If you integrate a jQuery definition file into your own source code directories at all, then use one that is compatible with the version of the jQuery library loaded for the project.

Since I have been following this advice, both the general JSDT JS code assist and the JSDT jQuery code assist work flawlessly. The crashes during Clean/Rebuild of a project have disappeared as well.

Have fun with Eclipse and JSDT in your development work.

 

Character sets and Ajax, PHP, JSON – decode/encode your strings properly!

Ajax and PHP programs run in a more or less complex environment. Very often you want to transfer data from a browser client via Ajax to a PHP server and save them after some manipulation into a MariaDB or MySQL database. As you use Ajax you expect some asynchronous response sent from the PHP server back to the client. This answer can have a complicated structure and may contain a combination of data from different sources, e.g. from the database or from your PHP programs.

If and when all components and interfaces (web pages, Ajax programs, the web server, files, PHP programs, PHP/MySQL interfaces, MySQL, …) are set up for UTF-8 character encoding, you will probably not experience any problems regarding the transfer of POST data to a PHP server by Ajax and further on into a MySQL database via a suitable PHP/MySQL interface. The same would be true for the Ajax response. In this article I shall assume that the Ajax response is expected as a JSON object, which we prepare by using the function json_encode() on the PHP side.

Due to provider restrictions or customer requirements you may not always find such an ideal “utf-8 only” situation where you can control all components. Instead, you may be forced to combine your PHP classes and methods with programs others have developed. E.g., your classes may be included into programs of others. And what you have to or should do with the Ajax data may depend on settings others have already performed in classes which are beyond your control. A simple example where a lack of communication may lead to trouble is the following:

You may find situations where the data transfer from the server side PHP programs into a MySQL database is pre-configured by a (foreign) class controlling the PHP/MySQL interface for the western character set iso-8859-1 instead of utf-8. Related settings of the MySQL system (SET NAMES) affect the PHP mysql, mysqli and pdo_mysql interfaces for the controlling program. In such situations the following statement holds:

If your own classes and methods do not provide data encoded with the expected character set at your PHP/MySQL interface, you may get garbage inside the database. This may in particular lead to classical "Umlaut"-problems for German, French and other languages.

So, as a PHP developer you are prepared to decode the POST or GET data strings of an Ajax request properly before transferring such string data to the database! However, what one sometimes may forget is the following:

You have to encode all data contributing to your Ajax response – which you may deliver in a JSON format to your browser – properly, too. And this encoding may depend on the respective data source or its interface to PHP.

And even worse: For one Ajax request the response data may be fetched from multiple sources, each encoded in a different charset. In case you want to use the JSON format for the response data you probably use the json_encode() function. But this function may react badly to a combination of strings encoded in different charsets! So, a proper and suitable encoding of string data from different sources should be performed before starting the json_encode() process in your PHP program! This requires complete knowledge of and control over the encoding of data from all sources that contribute strings to an Ajax response!

Otherwise, you may never get any (reasonable) result data back to your JavaScript function handling the Ajax response data. This happened to me lately, when I deployed classes which worked perfectly in a UTF-8 environment on a French LAMP system where the PHP/MySQL interfaces were set up for a latin-1 character set (corresponding to iso-8859-1). Due to proper decoding on the server side, Ajax data went correctly into a database; however, the expected complex response data comprising database data, data from files and programs were either not generated at all or generated incorrectly.

As I found it somewhat difficult to analyze what happened, I provide a short overview of some important steps for such Ajax situations below.

Setting a character set for the PHP/MySQL interface(s)

The character set for the PHP/MySQL connections is chosen from the PHP side by issuing a SQL command. For the old mysql interface, e.g., this may look like

$sql_unames = "SET NAMES 'latin1'";
mysql_query($sql_unames, $this->db);

Note that this setting for the PHP/MySQL interfaces has nothing to do with the MySQL character set settings for the database, a specific table or a table column! The NAMES setting actually prepares the database for the character set of incoming and outgoing data streams. The transformation of string data to (or from) the character set defined for your database/tables/columns is done additionally and internally inside the MySQL RDBMS.

With such a PHP/MySQL setting you may arrive at situations like the one displayed in the following drawing:

[Figure: ajax_encoding]

In the case sketched above I expect the result data to come back from the server to the browser in JSON format.

Looking at the transfer processes, one of the first questions is: How does or should the Ajax transfer to the server for POST data work with respect to character sets ?

Transfer POST data of Ajax-requests encoded with UTF-8

Normally, when you transfer data from a web form to a server you have to choose between the GET and the POST mechanism. This, of course, is also true for Ajax controlled data transfers. Before starting an Ajax request you have to set up the Ajax environment and objects in your JavaScript programs accordingly. But potentially there are more things to configure. With jQuery, for example, you may define the option “contentType” (which includes the character encoding of the transfer data), the “type” (the request method) and the “dataType” for the structural format of the response data:

$.ajaxSetup( { …..
    contentType : 'application/x-www-form-urlencoded; charset=UTF-8',
    type : 'POST',
    dataType : 'json'
…});

With the first option you could at least in principle change the charset for the encoding to iso-8859-1. However, I normally refrain from doing so, because it is not compliant with W3C-requirements. The jQuery/Ajax documentation says:

" The W3C XMLHttpRequest specification dictates that the charset is always UTF-8; specifying another charset will not force the browser to change the encoding."
(See: http://api.jquery.com/jquery.ajax/).

Therefore, I use the standard and send POST data in Ajax-requests utf-8 encoded. In our scenario this setting would lead to dramatic consequences on the PHP/MySQL side if you did not properly decode the sent data on the server before saving them into the database.
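
What “utf-8 encoded POST data” means on the wire can be checked with the standard `URLSearchParams` class, which produces a string in the `application/x-www-form-urlencoded` format named in the contentType option. This is only a small illustration; the parameter name `muc` is an example.

```javascript
// Build a form-urlencoded request body. Non-ASCII characters are
// percent-encoded as their UTF-8 bytes (ü = 0xC3 0xBC).
const body = new URLSearchParams({ muc: "M\u00fcnchen" }).toString();
console.log(body); // muc=M%C3%BCnchen
// In ISO-8859-1 the same character would be the single byte 0xFC (M%FCnchen).
```

The server therefore receives two bytes for the “ü”, which is exactly what has to be converted before the string is passed through a latin-1 database connection.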

In case you have used the “SET NAMES” SQL command to activate a latin-1 encoded database connection, you must apply the function utf8_decode() to utf-8 encoded strings in the $_POST array before you save these strings in database table fields!

In case you want to deploy Ajax and PHP codes in an international environment where “SET NAMES” may vary from server to server it is wise to analyze your PHP/MySQL interface settings before deciding whether and how to decode. Therefore, the PHP/MySQL interface settings should be available information for your PHP methods dealing with Ajax data.

Note that the function utf8_decode() decodes to the iso-8859-1 charset only. For some cases this may not be sufficient (think of the € sign!). Then the more general function iconv() is your friend on the PHP side.
See: http://de1.php.net/manual/de/function.iconv.php.

Now, you may think we have gained what we wanted for the “Ajax to database” transfer. Not quite:

The strings you eventually want to save in the database may be composed of substrings coming from different sources, not only from the $_POST array after an Ajax request. So, you need to control where the strings you compose come from and in which charset they are encoded. A very simple source is the program itself, but the program files (and/or includes) may have another charset than the $_POST data! So, the individual strings may require a different decoding or encoding treatment! For that purpose the general “Multibyte String Functions” of PHP may be of help for testing or creating specific encodings. See e.g.: http://php.net/manual/de/function.mb-detect-encoding.php

Do not forget to encode Ajax response data properly!

An Ajax request is answered asynchronously. I often use the JSON format for the response from the server to the browser. It is easy to handle and well suited for JavaScript. On the PHP side the json_encode() function helps to create the required JSON object from the components of an array. However, the strings combined into a JSON-conformant Ajax response object may come from different sources. In my scenario I had to combine data defined

  • in data files,
  • in PHP class definition files,
  • in a MySQL database.

All of these sources may provide the data with a different character encoding. In the simplest case, think about a combination (inclusion) of PHP files which some other developers have encoded in UTF-8 whereas your own files are encoded in iso-8859-1. This may e.g. be due to different standard settings in the Eclipse environments the programmers use.

Or let’s take another more realistic example fitting our scenario above:
Assume you have to work with some strings which contain a German “umlaut” such as “ü”, “ö”, “ä” or “ß”. E.g., in your $_POST array you may have received (via Ajax) some string “München” in W3C compliant UTF-8 format. Now, due to the database requirements discussed above you convert the “München” string in $_POST['muc'] with

$str_x = utf8_decode($_POST['muc']);

to iso-8859-1 before saving it into the database. Then the correct characters would appear in your database table (a fact which you could check by phpMyAdmin).

However, in some other part of your UTF-8 encoded PHP(5) program file (or in included files) you (or some other contributing programmers) may have defined a string variable $str_y that eventually shall also contribute to a JSON formatted Ajax response:

$str_y = "München";

Sooner or later, you prepare your Ajax response – maybe by something like :

$ay_ajax_response['x'] = $str_x;
$ay_ajax_response['y'] = $str_y;
$ajax_response = json_encode($ay_ajax_response);
echo $ajax_response;

(Of course I oversimplify; you would not use global data but much more sophisticated things … ). In such a situation you may never see your expected response values correctly. Depending on your concrete setup of the Ajax connection in your client Javascript/jQuery program you may not even get anything on the client side. Why? Because the PHP function json_encode() will return “false” ! Reason:

json_encode() expects all input strings to be utf-8 encoded !

But this is not the case for your decoded $str_x in our example! Now, think of string data coming from the database in our scenario:

For the same reason, weird things would also happen if you just retrieved some data from a database without thinking about the encoding of the PHP/MySQL interface. If you had used “SET NAMES” to set the PHP/MySQL interface to latin-1, then retrieved some string data from the base and injected them directly – i.e. without a transformation to utf-8 by utf8_encode() – into your Ajax response you would run into the same trouble as described in the example above. Therefore:

Before using json_encode() make sure that all strings in your input array, from whichever source they may come, are properly encoded in UTF-8! Watch out for specific settings for the database connection which may have been set by database handling objects. If your original strings coming from the database are encoded in iso-8859-1 you can use the PHP function utf8_encode() to get proper UTF-8 strings!
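
Why a latin-1 encoded string is not acceptable where UTF-8 is expected can be seen at the byte level. The following sketch is written in JavaScript rather than PHP and only illustrates the principle; `encodeLatin1` and `isValidUtf8` are hypothetical helper names.

```javascript
// Encode a string as ISO-8859-1 bytes; fails for characters outside that charset.
function encodeLatin1(text) {
  return Uint8Array.from(text, function (ch) {
    const code = ch.codePointAt(0);
    if (code > 0xff) {
      throw new RangeError("Not representable in ISO-8859-1: " + ch);
    }
    return code;
  });
}

// True if the byte sequence is well-formed UTF-8.
function isValidUtf8(bytes) {
  try {
    new TextDecoder("utf-8", { fatal: true }).decode(bytes);
    return true;
  } catch (err) {
    return false;
  }
}

const utf8Bytes = new TextEncoder().encode("M\u00fcnchen"); // ü becomes 0xC3 0xBC
const latin1Bytes = encodeLatin1("M\u00fcnchen");           // ü becomes 0xFC

console.log(isValidUtf8(utf8Bytes));   // true
console.log(isValidUtf8(latin1Bytes)); // false: a lone 0xFC byte is not valid UTF-8
```

The same helper also shows the limitation mentioned for the € sign: `encodeLatin1("\u20ac")` throws, because that character has no code in iso-8859-1.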

Some rules

The scenario and examples discussed above illustrate several important points when working with several sources that may use different charsets. I try to summarize these points as rules :

  • All program files should be written using the same character set encoding. (This rule seems natural but is not always guaranteed if the results of different developer groups have to be combined)
  • You should write your program statements such that you do not rely on some assumed charsets. Investigate the strings you deal with – e.g. with the PHP multibyte string functions “mb_….()” and test them for their (probable) charset.
  • When you actively use “SET NAMES” from your PHP code you should always make this information (i.e. the character code choice) available to the Ajax handling methods of your PHP objects dealing with the Ajax interface. This information is e.g. required to transform the POST input string data of Ajax requests into the right charset expected by your PHP/MySQL-interface.
  • When composing strings from different sources, align the character encoding over all sources. Relevant sources with different charsets may e.g. be: data files, databases, POST/GET data, ...
  • In case you have used “SET NAMES” to use some specific character set for your MySQL database connection do not forget to decode properly before saving into the database and to encode data fetched from the database properly into utf-8 if these data shall be part of the Ajax response. Relevant functions for utf-8/iso-8859-1 transformations may be utf8_encode(), utf8_decode() and for more general cases iconv().
  • If you use strings in your program that are encoded in some other charset than utf-8, but which shall contribute to your JSON formatted Ajax response, encode all these strings in utf-8 before you apply json_encode() ! Verify that all strings are in UTF8 format before using json_encode().
  • Always check the return value of json_encode() and react properly by something like
    if (json_encode($…) === false) {
        … error handling code …
    }
  • Last but not least: When designing your classes and methods for the Ajax handling on the PHP side always think about some internal debugging features, because due to restrictions and missing features you may not be able to fully debug variables on the server. You may need extra information in your Ajax response and you may need switches to change from an Ajax controlled situation to a standard synchronous client/server situation where you could directly see echo/print_r output from the server. Take into account that in some situations you may never get the Ajax response to the client …
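
On the client side, the last two rules suggest not trusting the response blindly. A minimal sketch, assuming the response is requested as plain text and parsed by your own code (the function name `parseAjaxResponse` and the shape of its result are hypothetical):

```javascript
// Parse the raw text of an Ajax response defensively. An empty body (e.g. after
// json_encode() returned false on the server) or stray PHP output in front of
// the JSON data is reported as an error instead of raising an exception.
function parseAjaxResponse(rawText) {
  if (typeof rawText !== "string" || rawText.trim() === "") {
    return { ok: false, error: "empty response", data: null };
  }
  try {
    return { ok: true, error: null, data: JSON.parse(rawText) };
  } catch (err) {
    return { ok: false, error: "invalid JSON: " + err.message, data: null };
  }
}
```

With `dataType: 'json'` jQuery performs the parsing itself and calls the error callback on failure, so a handler like this is mainly useful for debugging switches of the kind described above.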

I hope these rules may help some people who have to work with jQuery/Ajax, PHP and MySQL and are confronted with more than one character set.

Pitfalls when checking the status of long-running PHP jobs with Ajax – IV

With this post we continue our small series on polling status information about long-running PHP “RUN” jobs on a PHP web server from a web browser.

On the client side, the “status polling” is done with Ajax technologies which periodically start CHECKER jobs (PHP) on the server. These jobs query specific status information that the RUN job has stored in a database during its activities. The status information is transferred to the browser via Ajax, e.g. as a JSON object, and displayed there in a suitable way (e.g. by jQuery manipulations of HTML elements of the current web page).

In preparation, we looked at some special points in the following articles. See:
Pitfalls when checking the status of long-running PHP jobs with Ajax – I
Pitfalls when checking the status of long-running PHP jobs with Ajax – II
Pitfalls when checking the status of long-running PHP jobs with Ajax – III

In the first post we explained why it makes sense to store the status information in a database and not in a PHP SESSION object. In the second post of this series we already started to discuss that both the long-running “RUN” job and the periodically started “CHECKER” jobs, which “poll” the stored status information about the running “RUN” job from the server, are started via Ajax mechanisms from two separate forms of one and the same web page. Furthermore, display areas on the web page itself, or possibly an additional window opened by JavaScript, will take up feedback and information from the RUN job. The status information, in contrast, is directed into a defined display area of the web page.

For the forms as well as for the display areas of the web page we define JavaScript “control objects” for better encapsulation. Via internal properties and methods they represent

  • the associated (X)HTML/CSS elements via jQuery selectors, corresponding properties and methods,
  • more or less abstract internal processing functions for Ajax transactions and data,
  • and further required data preparation functionality.

These control objects encapsulate and control, among other things, the required Ajax transactions and set the corresponding properties for the XMLHttpRequest object. We had also pointed out that one has to be very careful about the context/scope of the “this” operator when defining the methods of the control objects. When using jQuery, the $.proxy() functionality has proven very helpful for enforcing the desired context.

Sketch of the interplay between forms and jobs

The relationship between the RUN job and the CHECKER job looks as follows:

[Figure: Run_Checker]

All blue connections between the browser client and the server symbolize Ajax transactions to the server or the associated responses from the server to the client.

A form “FR” takes care of starting the RUN job on the server and passes parameters to this job. For this form there is a JavaScript control object “Ctrl_Run” which controls the submit process via a method of its own and jQuery's Ajax functionality. This control object also creates a new browser window, to whose handle the form attribute “target” will subsequently refer. This attribute is either already defined in the HTML form definition or set by jQuery in good time before the form submit. The direct output of the RUN job, e.g. produced by “echo” or “print/printf”, then appears in this (sub-)window of the browser.

The submit of the “FR” form primarily starts the RUN job. Note, however, that the associated “Ctrl_Run” method also starts a “timer” process (loop) via a special method of a further control object “Ctrl_Check” for the form “FC”; this timer in turn periodically triggers the start of a CHECKER job. We will come back to this shortly.

Note that the start button in the form “FC” stands more symbolically for a submit event of this form. The submit event can of course be bound to a method of the control object via JavaScript. We discussed this in the last post.

Once started, the RUN job writes its direct output into the window provided for it. However, the RUN job also returns, rather later than sooner, a hopefully positive Ajax response, for which the control object “Ctrl_Run” has to take responsibility. Among other things, the timer for the periodic start of the CHECKER jobs must be stopped at that point at the latest. This can be done by calling a corresponding method of the “Ctrl_Check” object (see below). (Of course, the timer should additionally be stopped after a maximum permitted time interval has elapsed.) Furthermore, the RUN job stores information about its state in a database table provided for this purpose (see the first post of the series).

Exactly this status information is queried via SQL by the “CHECKER” job, which is started periodically via Ajax, and is transferred back to the browser client, e.g. as a JSON object, as part of the Ajax response. The control object for the CHECKER jobs then displays the retrieved status information in a suitable HTML object (e.g. a DIV), which may have to be scrolled systematically unless it does so itself when filled with new HTML content.

Regarding the control objects, we follow the remarks on the scope of the “this” operator made in the last post.

The strongly simplified code of the last post shows how the control objects must be structured in principle. The interesting aspect of our scenario is that we work in parallel with two forms and (at least) two corresponding control objects. In the case of the “Ctrl_Check” object we now additionally have to guarantee a periodic start of the CHECKER job on the server.

“this”, setInterval() and the control object for the “CHECKER” form

To trigger the CHECKER job periodically via Ajax we can use, e.g., the JavaScript function “setInterval()” or, within loop structures, also “setTimeout()”. Here I only look at “setInterval()”. This function of the global “window” object takes a reference to a (callback) function as its first parameter and a time interval in milliseconds as its second parameter.

If we follow our previously advocated philosophy that methods of a control object “Ctrl_Check” should control all (Ajax) operations related to the CHECKER process, then we must

  • on the one hand call “setInterval()” from a method of this very control object and
  • on the other hand pass a function/method of the control object itself, defined via a “prototype” statement, as the callback function when parameterizing setInterval().

Now one might be tempted, following the insights of the last post, to use code of a form similar to the following:

Wrong code:

C_C = new Ctrl_Check_Obj(); 
C_C.startTimer(); 

function Ctrl_Check_Obj() {
	this.interval = 400; 
	this.num_int = 0; 
	this.max_num_int = 200;
 	...
 	this.id_status_form = "# ...."; 
	...
}

Ctrl_Check_Obj.prototype.startTimer = function () {
	this.timex = setInterval(this.submitChecker, this.interval); 
	....
};

Ctrl_Check_Obj.prototype.submitChecker = function(e) {
		
	e.preventDefault();
	....
	// Count number of intervals - if larger than the limit => stop timer 
	this.num_int++;
	if (this.num_int > this.max_num_int ) { 
		this.stopTimer(); 
	}
	....
	....
	// Prepare Ajax transaction 	
	var url = $(this.id_status_form).attr('action'); // the PHP CHECKER job is defined in "action" !!!
	var form_data = $(this.id_status_form).serialize(); 
	var return_data_type = 'json'; 
	......
	$.ajaxSetup({
		contentType: "application/x-www-form-urlencoded; charset=ISO-8859-1",
		context:  this, 
		error: this.status_error, 
		.......
	});
	.....
	.....
	// Perform an Ajax transaction	
	$.post(url, form_data, this.status_response, return_data_type); 	
	.......		
	.......  				
};

Ctrl_Check_Obj.prototype.status_response = function(status_result) {
	// Do something with the Ajax (Json) response 
	.....
	this.msg = status_result.msg;
	.....
};

Ctrl_Check_Obj.prototype.stopTimer = function() {
	....	
	clearInterval(this.timex);
};

However, this does not work!

The main reason is that at the time setInterval() invokes the callback function, the “this” operator inside the callback refers to the global “window” object, and once again not to the context of our control object. This is actually logical: setInterval() receives only a bare function reference and has to work completely independently of any particular object. The only constant context that offers itself is the global one. Anything else requires additional measures on the developer's side.
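The effect can be reproduced without any timer. The following minimal sketch uses a hypothetical “Counter_Obj” (not part of the code above) and simply detaches a prototype method from its object, which is what happens when “this.submitChecker” is handed over to setInterval(). In strict mode “this” is then undefined; in non-strict browser code it would be the “window” object. In both cases it is not the control object.

```javascript
"use strict";

// Hypothetical stand-in for a control object; all names are illustrative only.
function Counter_Obj() {
	this.num_int = 0;
}

// Reports whether "this" still refers to a Counter_Obj when the method runs.
Counter_Obj.prototype.tick = function () {
	if (!(this instanceof Counter_Obj)) {
		return "context lost";
	}
	this.num_int++;
	return "context ok";
};

var counter = new Counter_Obj();

// Called as a method: "this" is the object.
var asMethod = counter.tick();

// Passed around as a bare function reference, the way setInterval() receives it:
// the link to "counter" is gone.
var detached = counter.tick;
var asCallback = detached();

console.log(asMethod, "/", asCallback);
```

Only the first call increments “num_int”; the detached call never reaches the object.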

So the error lies in the definition of the startTimer() method, or rather in the unreflected use of “this”. How must the faulty line

this.timex = setInterval(this.submitChecker, this.interval);

be changed?

A simple way out could lead via the global context of the “window” object. We could define global functions there as callbacks for setInterval(), which in turn call methods of the defined control objects. But we do not want to make it that easy for ourselves, because this would break the principle of encapsulating methods and variables in our control objects.

Readers of the last article will already suspect that jQuery's “$.proxy()” mechanism can again provide an elegant solution here. That is correct, and it looks as follows:

this.timex = setInterval( $.proxy(this.submitChecker, this), this.interval);

See also:
http://stackoverflow.com/ questions/ 14608994/ jquery-plugin-scope-with-setinterval

For other solutions that are not based on jQuery but on elementary JS closures, see the following articles:
https://coderwall.com/ p/ 65073w
http://techblog.shaneng.net/ 2005/04/ javascript-setinterval-problem.html
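For comparison, here is a minimal sketch of what such jQuery-free variants may look like. It is not taken from the linked articles; “Poller_Obj” and its methods are hypothetical names. The first variant keeps the object reachable through a closure variable “self”, the second uses the ES5 method Function.prototype.bind().

```javascript
// Hypothetical poller object; only the timer handling is sketched here.
function Poller_Obj(interval) {
	this.interval = interval;
	this.num_int = 0;
	this.timex = null;
}

Poller_Obj.prototype.tick = function () {
	this.num_int++;
};

// Variant 1: a closure keeps a reference to the object in "self".
Poller_Obj.prototype.makeClosureCallback = function () {
	var self = this;
	return function () {
		self.tick();
	};
};

// Variant 2: bind() returns a new function whose "this" is fixed to the object.
Poller_Obj.prototype.makeBoundCallback = function () {
	return this.tick.bind(this);
};

Poller_Obj.prototype.startTimer = function () {
	this.timex = setInterval(this.makeClosureCallback(), this.interval);
};

Poller_Obj.prototype.stopTimer = function () {
	clearInterval(this.timex);
	this.timex = null;
};

// Call both callbacks directly, the way the timer would call them.
var poller = new Poller_Obj(400);
var viaClosure = poller.makeClosureCallback();
var viaBind = poller.makeBoundCallback();
viaClosure();
viaBind();
console.log(poller.num_int); // 2: both calls reached the object
```

Both variants do by hand what $.proxy() does for us: they produce a function that carries its object with it.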

In our case, a working solution based on $.proxy() looks like this:

C_C = new Ctrl_Check_Obj(); 
C_C.startTimer(); 

function Ctrl_Check_Obj() {
	this.interval = 400; 
	this.num_int = 0; 
	this.max_num_int = 200;
 	...
 	this.id_status_form = "# ...."; 
	...
}

Ctrl_Check_Obj.prototype.startTimer = function () {
	this.timex = setInterval( $.proxy(this.submitChecker, this), this.interval);	
	....
};

Ctrl_Check_Obj.prototype.submitChecker = function(e) {
		
	// setInterval() passes no event object, so guard the call
	if (e && e.preventDefault) { e.preventDefault(); }
	....
	// Count number of intervals - if larger than limit => stop timer 
	this.num_int++;
	if (this.num_int > this.max_num_int ) { 
		this.stopTimer(); 
	}
	....
	....
	// Prepare Ajax transaction 	
	var url = $(this.id_status_form).attr('action'); 
	var form_data = $(this.id_status_form).serialize(); 
	var return_data_type = 'json'; 
	......
	$.ajaxSetup({
		contentType: "application/x-www-form-urlencoded; charset=ISO-8859-1",
		context:  this, 
		error: this.status_error, 
		.......
	});
	.....
	.....
	// Perform an Ajax transaction	
	$.post(url, form_data, this.status_response, return_data_type); 	
	.......		
	.......  				
};

Ctrl_Check_Obj.prototype.status_response = function(status_result) {
	// Do something with the Ajax (Json) response 
	.....
	this.msg = status_result.msg;
	.....
};
Ctrl_Check_Obj.prototype.stopTimer = function() {
	....	
	clearInterval(this.timex);
};

Note that the “this” in the parameter “this.interval” is not a problem. The argument is evaluated directly in the current context of the Ctrl_Check object when the global function setInterval() is set up, and its value is used to construct the timer loop. Only the context of the callback function causes problems: without intervention it would be executed in the scope of Javascript's “window” object.

Choosing a suitable polling interval

One small aspect deserves some more attention. Writing the status information takes the RUN job some time. When new information appears depends on the kind of tasks the RUN job works through sequentially. In addition, the Ajax transfer over the network/Internet takes time as well. Neither a too short nor a too long polling interval (in the code above this is the variable “interval” of the class Ctrl_Check_Obj()) is therefore wise. If “interval” is too short, CHECKER jobs may pile up without being able to deliver anything new in each run. If “interval” is too long, one effectively glosses over the pacing of the tasks and the related status of the RUN job.

A reasonable choice of the polling interval, i.e. the period for starting the CHECKER jobs, therefore depends primarily on the temporal granularity of the RUN job and secondarily on network transfer times, which may be of the same order of magnitude. In many of my cases 500 msec is a good starting value.
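One possible way to keep CHECKER requests from piling up when the interval turns out to be too short is to skip a tick while the previous request is still unanswered. The sketch below is an assumption and not part of the original code: “Poll_Guard_Obj”, “busy” and “sendRequest” are hypothetical names, and “sendRequest” stands for whatever starts the Ajax transaction and calls “done” once the response or an error has arrived.

```javascript
// Hypothetical guard object; sendRequest(done) starts exactly one request.
function Poll_Guard_Obj(sendRequest) {
	this.sendRequest = sendRequest;
	this.busy = false;   // true while a request is unanswered
	this.sent = 0;
	this.skipped = 0;
}

// Intended as the (bound or proxied) timer callback.
Poll_Guard_Obj.prototype.tick = function () {
	if (this.busy) {
		this.skipped++;   // previous CHECKER request still running
		return false;
	}
	this.busy = true;
	this.sent++;
	var self = this;
	this.sendRequest(function () {
		self.busy = false;   // response or error arrived
	});
	return true;
};

// Simulation without network: the fake request just stores its "done" callback.
var pendingDone = null;
var guard = new Poll_Guard_Obj(function (done) { pendingDone = done; });

var first = guard.tick();    // request goes out
var second = guard.tick();   // skipped, first one is still pending
pendingDone();               // simulated response
var third = guard.tick();    // request goes out again
console.log(first, second, third, guard.sent, guard.skipped);
```

In a control object like the one above, “tick” would be passed to setInterval() via $.proxy(), and “done” would be called from both the response and the error callback.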

Summary

From my point of view I have hereby covered the basic tools that should be used on the Javascript side, i.e. the client side, of the “RUN/CHECKER” scenario.

The PHP side is rather boring and boils down to elementary database transactions plus a standard JSON encoding of the collected information for the Ajax transfer. From my point of view these are elementary Ajax matters that need no further discussion here. Let me just point to the possible use of the PHP function

json_encode($ay_ajax_response);

to encode the results, which may have been collected in an associative array such as “$ay_ajax_response”.
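On the client, jQuery parses such a JSON response automatically when the data type 'json' is requested, so the response callback receives a plain object. The following sketch imitates this step with JSON.parse(). The field “msg” appears in the code above; “progress” and “finished” as well as the function name “formatStatus” are purely hypothetical examples of what a status record might contain.

```javascript
// A string as json_encode() might produce it on the server (assumed fields).
var raw_response = '{"msg":"Import of file 2 of 5 finished","progress":40,"finished":false}';

// Turns a decoded status object into one line of text for display.
function formatStatus(status_result) {
	var text = status_result.msg + " (" + status_result.progress + " %)";
	if (status_result.finished) {
		text += " - done";
	}
	return text;
}

var status_result = JSON.parse(raw_response);
var status_line = formatStatus(status_result);
console.log(status_line);
```

An associative PHP array is encoded as a JSON object, which is why its keys show up as properties such as “status_result.msg” on the Javascript side.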

Which information is stored as status information in the database, then read by the CHECKER job and transported to the web client, and finally prepared visually and displayed in the web browser naturally depends on the purpose of the RUN job.

This concludes our excursion into the potential traps one can stumble into when setting up a RUN/CHECKER system that polls status information about a long-running server job from a web client. Finally, we summarize some essential points of this article series:

  1. The long-running PHP server job “RUN” should write its intermediate status information into a database table and not into a SESSION object.
  2. Starting the RUN job and the CHECKER jobs, and handling their Ajax transactions, can be done in parallel via two forms of one web page. The transactions are controlled by “control objects”, whose methods (prototype functions) define the Ajax environment and the callbacks for response/error handling.
  3. When encapsulating the Ajax response/error handling in methods of the control objects, the scope/context of the “this” operator must be taken into account. jQuery's $.proxy() functionality helps to reach the goal quickly, elegantly and without explicitly writing closures.
  4. When the periodic start of the CHECKER jobs is steered by methods of a suitable control object and setInterval(), $.proxy() likewise helps to keep the periodic Ajax transactions for CHECKER within the context of the responsible control object.
  5. The time interval for periodically starting the CHECKER jobs must be adapted to the temporal granularity of the task handling in the RUN job and to possible network latencies.

Have fun monitoring the status of long-running PHP jobs from a web client.

Finally, note that the whole approach can of course be used in a much more general way to start and monitor several jobs of a web server from a web client. This is also interesting because threading of PHP jobs, where desired, requires special measures on the server. Sometimes it is much easier to use Ajax on the client to start and control several jobs on the server. Any required exchange of information between the running jobs can in many cases be handled via the database.