Data cache specification module: Difference between revisions

From Catglobe Wiki
jrfconvert import
 
Nguyentanphong (talk | contribs)
Revision as of 08:45, 16 February 2012



Data cache specification module

A Data Cache Specification is used to collect two types of data.

a) Questionnaire answer sheets (QAS) and the users that filled them in. This data is used to create Answer Sheet Data Caches.

b) Resources and the information linked to these. This data is used to create Resource Data Caches.

Once the data is collected, it is placed into data tables. These data tables are then used to make reports or export data.

The following chapters explain the logic behind these five processes.

Process 1. Retrieve data from several tables in the Microsoft SQL Server database as specified by the data cache.

Process 2. Store the data retrieved from the Microsoft SQL Server database in the long-term data cache (MySQL database). This saves the retrieved data in one table, making it a lot easier to work with.

Process 3. Retrieve the data cache specification’s data from the short-term data cache. The short-term data cache is explained below.

Process 4. Retrieve the data cache specification’s data from the long-term data cache. The long-term data cache is explained below.

Process 5. Invoke the build process. When and how is explained below.

What is the data cache?

The data cache is a specification of what data we want, where we want it from, and in what form we want to represent it when it is converted to one table. Data in Catglobe is stored in a rather complex relational setup, making it very hard to work with for most users. The data cache therefore converts it into a table in which each row represents a user's questionnaire answer sheet and each column holds information about the user or their answers in that answer sheet.
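To make the "one row per answer sheet" idea concrete, here is a minimal sketch of such a conversion. The input tuples, the column names and the `flatten` helper are all hypothetical and only illustrate the shape of the resulting table, not how Catglobe stores or converts its data.

```python
from collections import defaultdict

# Hypothetical relational rows: (qas_id, user_name, question, answer).
answers = [
    (1, "anna", "Q1", "yes"),
    (1, "anna", "Q2", "blue"),
    (2, "ben", "Q1", "no"),
]

def flatten(rows):
    """Return one record per answer sheet, with one column per question."""
    table = defaultdict(dict)
    for qas_id, user_name, question, answer in rows:
        record = table[qas_id]
        record["qas_id"] = qas_id
        record["user"] = user_name
        record[question] = answer
    # Sort by QAS id so the row order is stable.
    return [table[qas_id] for qas_id in sorted(table)]

data_table = flatten(answers)
```

Each element of `data_table` corresponds to one questionnaire answer sheet; a question the user did not answer simply has no column value in that record.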

We also have the option to create more complex rules for the records in the data table: answers from the QASs of more than one questionnaire can be placed in the same data table. This is called having primary and secondary questionnaires in the same data cache. The logic for how many records such a data cache will have is a bit more complex and is explained further down.

When will the data table of a data cache be built?

When a user asks the report or export module to get data from a data cache, the data cache manager (the manager of all the data caches in the system) will try to retrieve that data.

The first thing the data cache manager does is check whether the data cache is up to date. This is done by looking at the last updated date and the update frequency. How to set these values is explained in further help files on setting up your data cache.

If the data cache specification is considered up-to-date then:

• The data cache manager will try to get it from the short-term data cache.

• If the required data table does not exist in the short-term data cache, the data cache manager will try to look for it in the long-term data cache.

• If there is no data cached for the data cache specification in the long-term data cache, the building process will be invoked to rebuild the whole data cache specification.

• When the building process has finished, the data table will be put into the long-term and short-term data caches, and then returned to the client.

If the data cache specification is considered out-of-date, or its last updated date plus its update frequency is earlier than the current date and time, or the data cache specification is set to be auto-updated:

• The build process will be invoked to rebuild the data cache specification.

• When the building process has finished, the data table will be put into the long-term and short-term data caches, and then returned to the client.

Further, a data cache specification is also not up-to-date if its last updated date is not greater than its modified date.
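The retrieval rules above can be sketched roughly as follows. This is an illustration only: the `DataCacheManager` class, the dictionary keys of `spec` and the `build` callable are hypothetical names. The sketch also assumes that a specification counts as stale once its last updated date plus its update frequency has passed, and that a hit in the long-term cache is copied into the short-term cache.

```python
from datetime import datetime, timedelta

class DataCacheManager:
    def __init__(self, build):
        self.short_term = {}  # in-memory tables, keyed by specification id
        self.long_term = {}   # stand-in for the long-term store
        self.build = build    # callable that rebuilds a specification

    def is_up_to_date(self, spec, now):
        if spec["auto_update"]:
            return False
        # Not up to date unless the last update is later than the last modification.
        if spec["last_updated"] <= spec["modified"]:
            return False
        # Assumption: the cache stays fresh until the update frequency has elapsed.
        return spec["last_updated"] + spec["update_frequency"] > now

    def get_table(self, spec, now):
        if self.is_up_to_date(spec, now):
            table = self.short_term.get(spec["id"])
            if table is not None:
                return table
            table = self.long_term.get(spec["id"])
            if table is not None:
                self.short_term[spec["id"]] = table  # assumption: promote on hit
                return table
        # Stale, auto-updated, or nothing cached anywhere: rebuild everything.
        table = self.build(spec)
        self.long_term[spec["id"]] = table
        self.short_term[spec["id"]] = table
        spec["last_updated"] = now
        return table

build_calls = []

def fake_build(spec):
    build_calls.append(spec["id"])
    return [["row for spec", spec["id"]]]

now = datetime(2012, 2, 16, 8, 45)
spec = {
    "id": 1,
    "last_updated": now - timedelta(hours=1),
    "modified": now - timedelta(days=1),
    "update_frequency": timedelta(days=1),
    "auto_update": False,
}
manager = DataCacheManager(fake_build)
first = manager.get_table(spec, now)   # nothing cached yet, so a build runs
second = manager.get_table(spec, now)  # served from the short-term cache
```

In this run `fake_build` is called only once; the second request is answered from memory.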

We have two types of builds that are done by the data cache manager.

Forced build: When it is necessary to get the most up-to-date data for the data cache specification, the forced build will be invoked.

Recover build: When the last building process of a data cache specification was interrupted by a failure, such as a web server failure, a lost internet connection or a power failure, the recover build process will be invoked. Instead of starting over as a forced build does, the building process continues from the point at which the last building process was interrupted.
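The difference between the two build types can be illustrated with a small checkpointing sketch. The `run_build` function, the `checkpoint` dictionary (a stand-in for persistent progress storage) and the step names are hypothetical; the source does not describe how progress is actually recorded.

```python
def run_build(steps, checkpoint, forced=False):
    """Run build steps in order, recording progress in `checkpoint`.

    `steps` is a list of (name, callable) pairs. A forced build starts
    from scratch; otherwise the build resumes after the last finished step.
    """
    if forced:
        checkpoint["done"] = 0
    start = checkpoint.setdefault("done", 0)
    for index in range(start, len(steps)):
        name, step = steps[index]
        step()  # may raise if the build is interrupted
        checkpoint["done"] = index + 1
    return checkpoint["done"]

log = []
state = {"fail": True}

def load_answers():
    if state["fail"]:
        raise RuntimeError("simulated power failure")
    log.append("answers")

steps = [("users", lambda: log.append("users")), ("answers", load_answers)]
checkpoint = {}

try:
    run_build(steps, checkpoint)  # interrupted during the second step
except RuntimeError:
    pass

state["fail"] = False
run_build(steps, checkpoint)      # recover build: resumes at "answers"
```

After the recover build, the "users" step has run only once. Calling `run_build(steps, checkpoint, forced=True)` would repeat every step.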

What does the building process do?

When a data cache specification is rebuilt, the building process will retrieve data from the tables in the Microsoft SQL Server database as specified by the data cache and store them into the long-term data cache for later use.

If there is only one project questionnaire used inside the data cache specification, the building process will retrieve QASs’ ids and user ids which belong to the primary project questionnaire and satisfy the conditions stated by the data cache specification.

If there is more than one questionnaire used, the building process will create and execute a left outer join query (joining by user id) in which the primary questionnaire is on the left side and other questionnaires are on the right side in order to retrieve QASs’ ids and user ids satisfying the selection rules of the data cache from the Microsoft SQL Server database.
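A query of this kind might look like the sketch below. It uses an in-memory SQLite database as a stand-in for the Microsoft SQL Server database, and the `qas` table with its columns is a hypothetical schema chosen only to show the join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE qas (
        id INTEGER PRIMARY KEY,
        questionnaire_id INTEGER,
        user_id INTEGER
    );
    -- Questionnaire 1 is the primary one, questionnaire 2 the secondary one.
    INSERT INTO qas VALUES (10, 1, 100), (11, 1, 101), (20, 2, 100);
""")

rows = conn.execute("""
    SELECT p.user_id, p.id AS primary_qas_id, s.id AS secondary_qas_id
    FROM qas AS p
    LEFT OUTER JOIN qas AS s
        ON s.user_id = p.user_id AND s.questionnaire_id = 2
    WHERE p.questionnaire_id = 1
    ORDER BY p.id
""").fetchall()
```

Because the primary questionnaire is on the left side, user 101 still produces a record even though that user has no answer sheet for the secondary questionnaire; the secondary QAS id is simply empty (`None`) in that row.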

For each column type, the building process will retrieve the corresponding data for columns belonging to that type from the Microsoft SQL Server database and insert them into the temporary cache and then commit the recent cache to the long-term data cache.

Why do we have both a short term and long term data cache?

The reason we have both a short-term and a long-term data cache is to increase speed. The long-term data cache makes it possible to store the result of a data cache rebuild and thus avoid costly rebuilds when nothing new has happened to the data of a data cache and we just want to use the previously created data table. The short-term data cache places the data in server memory in order to minimize the time it takes to get data out of the long-term data cache, since users often use the data tables continuously over shorter periods of time. Data in the short-term data cache therefore does not stay there for more than 15 minutes before it is cleared.
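A short-term cache with a 15-minute lifetime could be sketched as below. The `ShortTermCache` class is hypothetical, and measuring the 15 minutes from the moment a table is stored (rather than from its last use) is an assumption, since the source does not say which applies.

```python
import time

class ShortTermCache:
    def __init__(self, ttl_seconds=15 * 60, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock   # injectable so the expiry can be tested
        self.entries = {}    # spec_id -> (stored_at, table)

    def put(self, spec_id, table):
        self.entries[spec_id] = (self.clock(), table)

    def get(self, spec_id):
        entry = self.entries.get(spec_id)
        if entry is None:
            return None
        stored_at, table = entry
        # Assumption: the lifetime is counted from the time of storing.
        if self.clock() - stored_at >= self.ttl:
            del self.entries[spec_id]
            return None
        return table

fake_now = [0.0]
cache = ShortTermCache(clock=lambda: fake_now[0])
cache.put(1, [["row"]])

fake_now[0] = 14 * 60
still_cached = cache.get(1)   # within 15 minutes: the table is returned

fake_now[0] = 15 * 60
expired = cache.get(1)        # 15 minutes have passed: the entry is cleared
```

When `get` returns `None`, the caller would fall back to the long-term data cache as described earlier.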