A novel database management logic designed for some important production stages of farm to table


Doctoral Thesis, 2016

116 Pages, Grade: 5


Excerpt


Table of contents

List of abbreviations

VITA

Acknowledgement

ABSTRACT OF THE DISSERTATION

1. Introduction
1.1. Comparison of different database models
1.2. Rivalry between NoSQL and SQL
1.3. Knowledge-based systems development
1.4. Joker Tao
1.5. Comparison of JT and ontological modeling
1.6. Mathematical definitions of the relational database model and JT database model
1.6.1. Mathematical definition of the relational database model
1.6.2. Mathematical definition of the JT database model
1.7. State of the art
1.7.1. Relational systems
1.7.2. NoSQL systems
1.8. Objectives of my research and development

2. Methods
2.1. Research methods
2.2. The developed data storage and management methods
2.2.1. Four dimensions of knowledge stored in the data table
2.2.2. The developed JT algorithm execution method
2.2.3. Virtual data table management
2.2.4. Indexing in JT
2.2.5. Converting data tables to one physical data table
2.3. Storing and processing the unknown, or from the data to the information
2.4. Technological environment
2.4.1. Cloud-based solutions

3. Results and discussion
3.1. Research results
3.2. Development results
3.3. JT database management
3.3.1. JT seed
3.3.2. ID usage
3.3.3. JT attribute
3.3.4. T_sequence and value
3.3.5. The dialectic of entity and attribute
3.3.6. JT Shell
3.3.7. Application Initialization (login) and access controls
3.3.8. Efficiency indicators
3.3.9. Development directions
3.3.10. Subset determination in JT logic

5. Conclusions
5.1. Conclusions of the research results
5.2. Conclusions of the development results

6. New and novel results

Appendix

A1. Converting tables into one physical data table

A2. Creating a view object

A3. JT seed and shell

A4. JT Login

A5. Indexing all attributes

A6. JT functions

A7. JT functions 2

A8. The visualization of an attribute from a virtual data table

A9. A data table structure of the variable assignment

A10. Mechanism enclosed in program lines, or operation of the variable assignment

A11. Database management and controls of the JT framework
Basic controls
Domain controls
Other classes
Prototype
Database management

A12. Attributes stored in the JT table

A13. JT physical data table

References

List of abbreviations

[Figure not included in this excerpt]

VITA

[Figure not included in this excerpt]

Acknowledgement

The present dissertation would not have been a success without the great support of various individuals and institutions. I am thankful to my supervisor, Prof. Dr. Gerhard Berchtold, who offered significant help, and to my external supervisor, Prof. Dr. Alice Fialowski, who supervised the mathematical proofs and the software developments.

I am thankful to my dad, György Mátyás, the owner of the idea behind the first JT framework. The first version of JT is a Hungarian product which was developed in 2008 (R.number: INNO-1-2008-0015 MFB-00897/2008) thanks to an INNOCSEK European Union application. I am thankful to Call Tec Consulting Ltd. (the first company in Hungary with the highest Oracle certification), which was also the first to validate JT. I thank Oracle Apexhosting(.hu), COOP Ltd., and Norpan Ltd. for testing and using our software. I am thankful to Prof. Dr. János Kátai and Prof. Dr. Kazuyuki Inubushi for their support in soil science. I thank every member of the University of Debrecen, Debrecen, Hungary, and Chiba University, Matsudo, Japan who helped me. I thank my family for their tireless support and encouragement during my research work.

ABSTRACT OF THE DISSERTATION

Farm-to-table refers to the stages of food production and is associated with sustainable agriculture. Studies on climate change and greenhouse gas emissions linked to soil cultivation require large amounts of data. These data mainly come from laboratory measuring devices or from various systems, including geographical information systems and other specific systems, which operate with data from different data storage structures. Operating with data from different data storage structures is a challenge for companies and research institutes. Modern database management systems fall into two broad classes: RDBMS and NoSQL. The interoperability between SQL and NoSQL often garners considerable attention in the world of databases, and many solutions have been proposed to solve this issue. There are some initiatives in the literature concerning converting systems. These systems apply different data storage techniques, but there is no available system yet capable of managing data from different storage structures in one physical data table without several conversions. In software development, migration from an RDBMS to a non-relational model-based system, especially one with distinct characteristics, is still a challenge for programmers and database developers. Novel models have to serve new needs. During the development discussed in the present dissertation, I focused on creating a data storage logic which increases the performance of an RDBMS-based database management system that applies a NoSQL data storage concept.

The primary goal of the present dissertation is to introduce a self-developed data storage logic called Joker Tao (JT), which provides the opportunity to store and manage all input data in one (physical) data table while the data storage concept remains structured. Introducing a new entity attribute in any relational database requires modification of the existing data tables and of the applications that use them. JT's solution is to increase the number of virtual data tables while the number of screens remains the same. JT can be defined as a NoSQL engine on an SQL platform that can serve data from different data storage concepts without several conversions. Using the developed database management logic, each attribute needs only one index in the database. JT allows any data, whether entity, attribute, data connection or formula, to be stored and managed in just one physical data table. In JT logic-based databases, the entity and the attribute are used interchangeably, so users can expand the database with new attributes either during or after the development process. With JT logic, data storage using one physical data table is ensured in an SQL database system for the storage and management of long-term scientific information.

I examined the effects and importance of JT during some production stages of farm to table: namely, the interoperability between soil and GHG production data from different data storage structures, and data relating to food-stock records, billing and payroll in the baking industry are examined in this dissertation. Furthermore, I examined the changes in the number of computer operators after the introduction of JT logic in a bakery company. The input data come from my own research related to the soil carbon cycle (University of Debrecen) and from a bakery company (Norpan Ltd.) where the developed JT logic was applied. My dataset was validated within the Hungarian-Japanese intergovernmental S&T cooperation program at Chiba University, Matsudo, Japan and the University of Debrecen, Debrecen, Hungary between 2009 and 2011. The database was extended between 2011 and 2016 at the Institute of Agricultural Chemistry and Soil Science, University of Debrecen, Debrecen, Hungary. In Hungary, I continued the more than 30-year-old long-term fertilization experiment at Debrecen-Látókép. This experiment was set up in 1983, and my samples were taken each semester from spring 2014 to spring 2015. In these experiments, I examined the effect of different fertilizer doses (Control, N60P45K45, N120P90K90, N180P120K120, N240P180K180), crop rotation (maize mono-, bi- and triculture), and irrigation on some soil parameters related to the carbon cycle.

The experimental results and their statistics suggest that the bi- and triculture induced higher microbial activity, which was reflected in the soil respiration and microbial biomass carbon (MBC). Results of the spring and autumn samples show comparable tendencies, although higher soil respiration values can be observed in the autumn samples due to their higher moisture content: 21-24% in the autumn samples compared with 19-21% in the spring samples, although soil samples from both seasons had optimum moisture content. Irrigation had a less significant effect on soil respiration. Intensive microbial activity can be observed during both seasons. Both small and small-medium fertilizer doses have a more favorable effect on soil respiration in both spring and autumn. Similar trends in MBC values can be observed in crop rotation. Irrigation had a statistically proven positive impact on carbon dioxide production but significantly decreased the MBC values.

With regard to development, the results show that in the case of 1000 records or more, significantly faster queries could be realized with the JT model in the cloud compared to relational databases (in an Oracle APEX environment). The results show that from 10,000 records upwards, the relational model generates slow (more than 1 second) queries in a cloud-based environment, while JT remains within a one-second time frame. Using the developed database management logic, each attribute needs only one index in the database. In summary, the user can extend the system simply by inputting new records using the JT data storage structure. There is no need to use several applications and conversions to communicate between different data storage concepts, saving companies and research institutes time and money. Numerically, using the developed JT logic, more than 700 data tables could be stored in one data table (Norpan Ltd.). The system works in cases that use more than one million records (COOP Ltd.). After the JT logic was applied, the number of computer operators decreased by 50% at a bakery company (Norpan Ltd.).

1. Introduction

„Future and even current farmers are experiencing that the managerial tasks for arable farming are shifting to a new paradigm, requiring increased attention to economic viability and the interaction with the surroundings. To this end, an integration of information systems is needed to advise managers of formal instructions, recommended guidelines and documentation requirements for various decision making processes. In the EU funded project FutureFarm, a new model and prototype of a new Farm Information Management System (FMIS) which meets these changing requirements will be developed" (Sorensen et al., 2010). Chemical and other energy inputs, such as cultivation, pesticides and fertilizers, have faced growing movements directed at lessening their usage (Klepper et al., 1977; Youngberg, 1984; Lockeretz et al., 1984; Edens et al., 1985; Buttel et al., 1986; Wagstaff, 1987). In these lands, only a few or no external inputs are used; owing to the adoption of regenerative technologies, agricultural yields have substantially improved (Bunch, 1991, 1993; GTZ, 1992; UNDP, 1992; Lobo and Kochendorfer-Lucius, 1992; Krishna, 1993; Shah, 1994; SWCB, 1994; Balbarino and Alcober, 1994; Pretty, 1995).

There has been an increase in the awareness that soil is an important component of the earth's biosphere, stimulated by the interest in evaluating the quality and health of soil resources (Glanz, 1995; Várallyay, 2005). The capacity of a living soil to function within ecosystem boundaries and natural limits, to maintain or enhance air and water quality, and to sustain plant and animal productivity has been widely defined as soil health (Doran et al., 1996; Doran et al., 1998). Agricultural sustainability is determined by soil quality (Acton and Gregorich, 1995; Papendick and Parr, 1992) and environmental quality (Pierzynski et al., 1994), which jointly determine plant, animal, and human health (Haberern, 1992). „Scientists make a significant contribution to sustainable land management by translating scientific knowledge and information on soil function into practical tools and approaches by which land managers can assess the sustainability of their management practices" (Bouma, 1997 and Dumanski et al., 1992).

In an ideal sustainable management system, growing attention is paid to the examination of soil greenhouse gas emissions. Apart from water vapour, nitrous oxide (N2O), methane (CH4) and carbon dioxide (CO2) are the important greenhouse gases (GHG), contributing 6, 20 and 60% towards global warming, respectively (IPCC, 2007). In the past century there has been an over 0.8 °C increase in global temperatures, and a further increase of 1.1-6.4 °C is predicted over the next century (Peters et al., 2013). The atmospheric concentrations of N2O, CH4 and CO2 increased from pre-industrial levels of 270 ppb, 715 ppb and 280 ppm to about 319 ppb, 1,774 ppb and 379 ppm, respectively, by 2005 (IPCC, 2007). Though CO2 has a higher concentration than CH4 and N2O, the latter two gases have global warming potentials 25 and 298 times higher than that of CO2, respectively (IPCC, 2007). Ravishankara et al. (2009) reported that N2O can also lead to depletion of the stratospheric ozone layer. The principal source of CO2 emission is soil vegetation and respiration, through which the gas is released into the atmosphere (Raich & Schlesinger, 1992). Several studies have been published examining the emission of these GHG under different crop rotations, soil types, fertilizer and irrigation management practices (Cabrera et al., 1994; Glatzel et al., 2004; Kong et al., 2013; Singla & Inubushi, 2014a,b; Singla et al., 2014b; Mátyás et al., 2015a).

Examinations of climate change and greenhouse gas emissions linked to agriculture require large amounts of data. Overwhelmingly, these data come from measuring devices or from different systems, such as geographical information systems and other specific systems, which operate with different data storage structures. Despite the use of modern database technology over the years, and further advancement of the technology with the availability of supercomputers, the biggest challenge has been processing large amounts of data from different data storage methods. Some NoSQL model-based data storage concepts assure the possibility of continuous refinement during the research process, owing to the determination of new variables after the entity identification process (Mátyás et al., 2015b).

It would be useful and time-saving if such huge amounts of data from different data storage structures could be stored in one physical data table using one platform. A data mining project generally consists of three essential processes: data collection, data preparation and data modeling (Pyle, 1999; Westerman, 2001). A single platform can be used to store different data in one data table, provided the generated model ensures this possibility (Imhoff et al., 2003). The main benefit of this system model is seen when huge amounts of data of different types must be compiled in one project. Integrated mechanisms ensure that the system stores input data in one physical (non-virtual) data table while giving users the ability to input data into usual (relational) data tables (Mátyás et al., 2015a).

1.1. Comparison of different database models

The database model is the logical structure of a database, which determines how data will be stored, retrieved and updated, and what relations exist between data items. Different data models have emerged over the years (Pramod et al., 2015).

The hierarchical database model is conceptually very simple, but its implementation is very complex. The model is very poor at managing huge amounts of data and supports only limited database integrity and limited structural independence (Pramod et al., 2015). Hierarchical data structures are central to the nested relational model (Abiteboul, 1986; Fisher, 1983; Roth, 1985; Scheck, 1982), and have been studied under the names formats (Jacobs, 1982) and complex objects (Abiteboul, 1987; Bancilhon, 1986).

The network database model is conceptually very simple like the hierarchical model, but its implementation is very complex. This model handles more relationship types and supports more data integrity than the hierarchical database model, yet it suffers from a lack of structural independence and standards (Pramod et al., 2015). Some network database model-based databases are designed to store full descriptions of interactions, molecular complexes and pathways (Bader et al., 2001).

The relational database model is based on the set theory of mathematics (Shenoi et al.). In set theory, a two-dimensional collection of information is called a relation. The RDBMS provides a very simple way to construct, access and update a database. This model supports only text and numeric values; it does not support abstract data types such as audio, video and geographical information (Codd, 1982; Pramod et al., 2015).

The object-oriented database model combines the concepts of object-oriented programming with database technology to provide an integrated application development system. This model supports abstract data types such as audio, video and geographical information. In an OODBMS, productivity can be improved with the help of inheritance; it also supports navigational and associative accessing of information (Dittrich, 1989; Pramod et al., 2015).

Modern database management systems fall into two broad classes: RDBMS and NoSQL. Organizations that collect large amounts of unstructured data increasingly turn to non-relational databases, frequently referred to as NoSQL databases. NoSQL databases focus on the analytical processing of large-scale datasets, offering increased scalability over commodity hardware (Leavitt, 2010; Pereira et al., 2015). All systems corresponding to the subclasses of the RDBMS class are based on the relational data model and the SQL query language. In contrast, systems corresponding to the NoSQL subclasses vary widely in terms of data models, query languages, support for database transactions, application programming interfaces, and granularity of security features (Pramod et al., 2015).

Researchers have provided some means for translating a relational data schema into an Entity-Relationship (ER) model (Dumpla et al., 1981; Davis et al., 1988; Navathe et al., 1988; Johannesson et al., 1989; Kalman, 1989; Markowitz, 1990; Wenguang et al., 1991). Navathe and Awong (1988) provided a detailed classification of relations and attributes. Classified relations and attributes can be translated into entity types, relationship types and categories of the Entity-Category-Relationship model (Elmasri et al., 1985).

1.2. Rivalry between NoSQL and SQL

Several database concepts exist for storing data, of which the relational database concept is the most often used. Relational databases are effective for data storage in several specializations, but in some rapidly developing areas they often reach their performance limits (Schmid et al., 2015). The explosive growth of data has seen cloud computing become more popular recently. Traditional web-based content management systems (CMS), e.g., phpBB, WordPress, and Joomla, store data using relational databases, whose main advantage is the strong relationships between data tables. However, regarding flexibility and the feasibility of parallel processing, cloud computing applies NoSQL databases that support horizontal scaling to manage big data. Therefore, transforming existing SQL model-based data into the NoSQL environment becomes a necessary issue (Chao-Hsien et al., 2015). In software development, migration from a DBMS to a non-relational model-based system, especially one with distinct characteristics, is a challenge for programmers and database developers (Leonardo et al., 2015).

The interoperability between SQL and NoSQL often garners considerable attention in the world of databases, and many solutions have been proposed to solve this issue. There are some initiatives in the literature concerning converting systems (Wang et al., 2003). These systems apply different data storage techniques, but there is no available system yet which can manage data from different data storage structures in one physical data table without several conversions.

Because of the rivalry between established RDBMS vendors and new database vendors, there is confusion and hype in the marketplace. There is also disagreement on the terminology, taxonomy, and classification of new data management systems. This is further exacerbated by the fast-evolving product landscape and vested vendor interests (Raghavan et al., 2014).

The primary goal of this dissertation is to introduce a novel, self-developed database model which provides an opportunity to store and manage all input data in one (physical) data table while the data storage concept remains structured.

1.3. Knowledge-based systems development

In this subchapter, the Protégé project is introduced. The original application was designed to build knowledge-acquisition tools for specialized programs in medical planning only. From that tool, the Protégé system has evolved into an extensible, durable platform for knowledge-based systems (Gennari et al., 2003). The latest version, Protégé-2000, has been used by hundreds of individuals and research teams; it runs on a variety of platforms, supports customized user interfaces, incorporates the Open Knowledge-Base Connectivity (OKBC) knowledge model, and interacts with different standard storage formats such as relational databases, XML, and RDF. The original objective of the Protégé project was to reduce the knowledge-acquisition bottleneck (Hayes-Roth et al., 1983) by minimizing the role of the knowledge engineer in constructing knowledge bases. Protégé is neither an expert system itself nor an application which builds expert systems directly; it is a platform that helps users build other tools that are custom-tailored to assist with knowledge acquisition for expert systems in specific application areas (Musen, 1989a). The evolution of Protégé started with the development of Opal (Musen, 1987), a knowledge-acquisition tool for the Oncocin system (Shortliffe et al., 1981). Opal aimed to improve the traditional expert-system development approach by moving tasks from the knowledge engineers to the domain experts, thereby reducing the likelihood of mistakes and streamlining knowledge-base construction (Sandahl, 1994). Later, as Protégé applications and knowledge bases became larger and more complex, the developers adopted a more componential view of both knowledge bases and problem-solving techniques (Musen and Tu, 1993). Protégé-2000 supports the idea that the labor of knowledge-base construction should be divided into (1) overall ontology construction by a knowledge engineer and (2) knowledge-base filling-in by a domain expert. „The Protégé effort illustrates the importance of combining a usable implementation for practical applications with a research platform for scientific experimentation. Protégé has undergone several design iterations, which helped the users evaluate Protégé progressively, understand and define the problems of using Protégé, and test new design solutions" (Gennari et al., 2003).

1.4. Joker Tao

In this subchapter I briefly introduce the self-designed database management technology called Joker Tao (JT), which is based on a novel data model. Horizontal column expansion is not used; instead, a customized code table was created which allows any data (an entity, an attribute, an integrity condition or a formula) to be stored and managed in one physical data table. In this data model, the entity and the attribute are used interchangeably. Physical records with the same ID value form a virtual record. The set of virtual records with the same value of the 'belonging to the data table' attribute forms a virtual data table. The input data are managed as entity and attribute at the same time; the categorization depth is determined by the actual task. One of the major problems in database management is indexing: if developers index a huge number of attributes in a database, the increased number of index tables may slow down the system or the database development. JT ensures that all attributes in a database can be indexed without using several indexes. In JT there is no normalization in the traditional sense and no complex key usage. The first physical field (column) is used for the unique identification of virtual entities; another field (Attribute) is used for the identification of physical records within a virtual entity. JT also handles the relational data model differently: it allows records with the same ID value to identify a single entity. Similarly, non-relational models are treated in a novel way in JT: the input data are not stored in an unstructured concept, making it easier to manage a huge amount of data without creating several applications. Compared to NoSQL models, the speed of the JT system lies in its use of vertical data expansion and the elimination of the sequential search approach, which slows queries. JT derives data from the relationships between entities and attributes stored in the system. In several cases the system can be developed simply by entering new records; records can be created manually or triggered by events.
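As an illustration of this structure, the following is a minimal SQL sketch (Oracle syntax) of a JT-style physical data table. The ID and Attribute fields are named in the text above; the table name, the value column and the sample records are assumptions made here for demonstration only (the dissertation's actual table layout is given in Appendix A13).

```sql
-- Minimal sketch of a JT-style physical data table (names assumed).
CREATE TABLE jt_physical (
  id        NUMBER         NOT NULL,  -- identifies a virtual entity
  attribute VARCHAR2(100)  NOT NULL,  -- identifies a physical record within the entity
  value     VARCHAR2(4000)            -- the stored datum: entity, attribute, connection or formula
);

-- One composite index covers lookups on every attribute in the database,
-- so no per-attribute index tables are needed.
CREATE INDEX jt_attr_idx ON jt_physical (attribute, value);

-- Three physical records sharing one ID value form one virtual record.
INSERT INTO jt_physical VALUES (101, 'BVDT', 'SOIL_SAMPLES');
INSERT INTO jt_physical VALUES (101, 'SITE', 'Debrecen-Latokep');
INSERT INTO jt_physical VALUES (101, 'MOISTURE_PCT', '21.4');
```

Because a new attribute is just a new row in this sketch, expanding a virtual data table requires no ALTER TABLE statement and no application change.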

I present a solution which puts forth a whole new approach, eliminating the need for conversions and file compatibility problems by combining the different data storage concepts at the physical data storage level. JT can then be defined as a NoSQL engine on an SQL platform that can serve data from different data storage concepts without several conversions. The greatest strength of the JT system is the fact that each datum can be both an entity and an attribute at the same time.

1.5. Comparison of JT and ontological modeling

For ease of understanding the JT database model, I compare JT with Protégé in this subchapter (Table 1). The input parameters are the same in both systems: definitions, objects, relations and descriptions of a concrete field.

Table 1. Comparison of JT and Protégé (own edition)

[Figure not included in this excerpt]

I compared JT with Protégé to demonstrate the differences in data interoperability. While in Protégé interoperability is limited to different logical levels, in JT it is ensured by the universal data storage structure behind it. This provides an opportunity for different databases that use JT logic in their data storage structure to use the same formulas and to support each other automatically.
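As an example of this shared-formula capability: because JT stores formulas as ordinary records, a formula defined once could be reused by any JT-based database. The following inserts are a hypothetical sketch against the jt_physical table from the previous subchapter; the attribute names and the humus-to-organic-carbon factor are illustrative only.

```sql
-- A formula stored as an ordinary virtual record (hypothetical names).
INSERT INTO jt_physical VALUES (205, 'BVDT', 'FORMULAS');
INSERT INTO jt_physical VALUES (205, 'NAME', 'ORGANIC_CARBON');
INSERT INTO jt_physical VALUES (205, 'FORMULA', 'HUMUS_PCT * 0.58');
```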

1.6. Mathematical definitions of the relational database model and JT database model

1.6.1. Mathematical definition of the relational database model

A primitive relation scheme is a three-tuple

PRS = (Ω, Δ, dom)

where

Ω is a finite set of attributes; the attributes are the headings of the columns (fields);

Δ is a finite set of domains; each domain is a set of values which may be infinite;

dom: Ω → Δ is a function that associates with each attribute a domain. For each attribute, only the values of the corresponding domain may appear in the column that is headed by that attribute.

A relation scheme (or briefly a relation) is a three-tuple

RS = (PRS, M, SC)

where

PRS is a primitive relation scheme;

M is the meaning of the relation. This is an informal component of the definition, since it refers to the real world and since we will mostly describe the meaning in a human, natural language. In nearly all theoretical studies the M component of a relation scheme has little importance. However, we include it in the definition of a relation scheme because it is a fundamental, time-independent property of the relation;

SC is a set of relation constraints or conditions.

So, a relation scheme is a three-tuple whose first component is a three-tuple in turn. However, whenever the intermediate step of a primitive relation scheme is not relevant in our discussions, we shall conveniently denote a relation as a five-tuple RS = (Ω, Δ, dom, M, SC) (Paredaens et al., 1989).

1.6.2. Mathematical definition of the JT database model

In JT logic-based databases, only one data table (called the physical data table) is used for storing and defining all data. Data tables in the traditional sense (called virtual or logical data tables) are stored and defined in the physical data table. The JT records form entities, virtual data tables and virtual attributes.

A primitive relation scheme is a three-tuple

PRS = (Ωv, Δ, dom)

where

Ωv is a finite set of attributes; in our case, it is the set of entities from the ATTRIBUTES virtual data table;

Δ is a finite set of entities; in our case, it is a set of virtual records;

dom: Ωv → Δ is a function that associates each attribute with an entity; it can be interpreted as a predefined set of attributes called the "1:N registry hive". This function is used to maintain the entities in the virtual data tables.

A relation scheme (or briefly a relation) is a three-tuple

RS = (PRS, M, SC)

where

PRS is a primitive relation scheme;

M is the meaning of the relation. This is an informal component of the definition, since it refers to the real world and since we will describe it using a natural language;

SC is a set of relation constraints.

From the JT physical data table, the following definitions can be read out:

- A virtual record is the set of physical records which have the same ID value.

- A virtual data table is the set of virtual records which have the same value of the 'belonging to the virtual data table' (BVDT) attribute (Mátyás et al., 2016).
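Restated as queries against the hypothetical jt_physical table sketched in Section 1.4, the two definitions could read as follows:

```sql
-- Virtual record: all physical records that share one ID value.
SELECT attribute, value
FROM   jt_physical
WHERE  id = 101;

-- Virtual data table: all virtual records whose BVDT attribute carries
-- the same value.
SELECT *
FROM   jt_physical
WHERE  id IN (SELECT id
              FROM   jt_physical
              WHERE  attribute = 'BVDT'
              AND    value = 'SOIL_SAMPLES');
```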

1.7. State of the art

The current emergence of a new class of systems for data management has challenged the well-entrenched relational databases. These systems provide several choices for data management under the umbrella term NoSQL (Gudivada et al., 2014). „Collecting, integrating and storing large amounts of information is quickly becoming a necessity among software engineers in industry, as well as by scientists in research settings" (Schram et al., 2012). Making the right choice is critical to building applications that meet business needs. Performance, scalability and cost are the principal business drivers for these new systems. By design, they do not provide all of the relational database features such as transactions, data integrity enforcement, and a declarative query language. Modern database management systems fall into two broad classes: RDBMS and NoSQL. All systems corresponding to the subclasses of the RDBMS class are based on the relational data model and the SQL query language. In contrast, systems corresponding to the NoSQL subclasses vary widely in terms of data models, query languages, support for database transactions, application programming interfaces, and granularity of security features (Stonebraker, 2011; Gudivada et al., 2014).

1.7.1. Relational systems

The relational database has been the dominant model for database management (Mukhopadhyay et al., 2010; Du et al., 2008) since it was developed by Edgar Codd in 1970 (Shuxin and Indrakshi, 2005; Nance et al., 2013). High availability in relational systems is achieved through replication and data partitioning on disk systems. The schema in relational databases devolved from many interrelated, fully expressed tables to simple key/value look-up (Padhy et al., 2011). Likewise, enhanced performance can be attained through vertical scaling. However, these approaches are expensive and limited. Schema rigidity and unacceptable query latency are often cited as relational database limitations for Web 2.0 applications. Nonetheless, their revenue will reach $41 billion in 2016, which represents 93% of the entire database market (Gartner, 2012). Column-oriented RDBMS use a storage model that is optimized for efficient computation of column aggregates to meet the requirements of Online Analytical Processing (OLAP) applications. MonetDB, MonetDB/X100, C-Store, and InfiniDB are examples of column-oriented RDBMS. There are over 80 open source and commercial RDBMS (Solid IT, 2014). Oracle, IBM DB2, Microsoft SQL Server, SAP HANA, Teradata, PostgreSQL, and MySQL are the most widely used. They are suitable for applications that require data integrity preservation, authentication and fine-grained access control, declarative query languages and transactions (Gudivada et al., 2014).

1.7.2. NoSQL systems

„Relational databases are widely used in most applications to store and retrieve data. They work best when they handle a limited set of data. Handling real-time, huge volumes of data, such as the internet, was inefficient in relational database systems (Yi et al., 2010; Peng et al., 2010; Parker et al., 2013). To overcome this problem the NoSQL database came into existence" (Tauro et al., 2012; Parker et al., 2013).

„Cloud computing has emerged as a paradigm that promises to meet these requirements. This work focuses on the storage aspect of cloud computing, specifically on data management in cloud environments. Traditional relational databases were designed in a different hardware and software era and are facing challenges in meeting the performance and scale requirements of Big Data. NoSQL and NewSQL data stores present themselves as alternatives that can handle huge volume of data” (Grolinger, 2013).

NewSQL systems have evolved from traditional RDBMS and have been influenced by NoSQL systems. They are also known as multi-model systems, as they support more than one data model. PostgreSQL-SC, VoltDB, VoltCache, ArangoDB, Aerospike, FoundationDB, and Spanner are representative systems in this category. They differ in system architecture (master-master and multi-level masters), and in support for transactions, sharding, replication, native MapReduce, and client access methods. Applications that critically depend on RDBMS functions but also require some NoSQL features are suitable candidates for using NewSQL (Table 2) (Parker et al., 2013; Gudivada et al., 2014).

The goal of this subchapter is to introduce some widespread database models, not to give an exhaustive survey of NoSQL systems; see Lamb, 2011; Marcus, 2011; Ellis, 2011; Popescu, 2011; Edlich, 2012; and Mohan, 2013 for detailed information on them.

„The simplest data stores use a data model similar to the popular memcached distributed in-memory cache, with a single key-value index for all the data" (Cattell, 2010). The value can be anything and is treated as opaque binary data. The key is transformed into an index using a hash function. Partitioning schemes other than hashing are also often used, especially in systems that store values in sorted order on disk. These methods of storing data entail fast retrieval, and the query latency is independent of data size (Gudivada et al., 2014). The defining characteristics of key-value databases include real-time processing of Big Data, horizontal scalability, reliability and high availability. There are over 40 key-value systems; Redis, Memcached, Riak, DynamoDB, and Ehcache are among the more popular. Key-value systems differ widely in functionality and performance. They vary in the architecture used (master-master, master-slave, and client-server), in-memory database features, disk persistence, data structures, transactions, sharding, replication, native MapReduce, and client access methods. Use cases for key-value systems include applications where the response time is on the order of milliseconds and the primary query mechanism is key-based lookup (Solid IT, 2014).

„Project Voldemort is an advanced key-value store" (Cattell, 2010). Voldemort supports optimistic locking for consistent multi-record updates: if updates conflict with any other process, they can be backed out. Vector clocks, as used in Dynamo (Cooper, 2010), provide an ordering on versions. „Voldemort supports automatic sharding of data. Consistent hashing is used to distribute data around a ring of nodes: data hashed to node K is replicated on node K+1 … K+n where n is the desired number of extra copies (often n=1). Using good sharding technique, there should be many more "virtual" nodes than physical nodes (servers). Once data partitioning is set up, its operation is transparent. Nodes can be added or removed from a database cluster, and the system adapts automatically. Voldemort automatically detects and recovers failed nodes." Voldemort is written in Java (Cattell, 2010).

Such applications include session management for Web applications; configuration management; distributed locks; messaging; personalization of the user experience and engaging user interactions in social media, mobile platforms, and Internet gaming; real-time bidding and online trading; ad servers; recommendation engines; and multi-channel retailing and eCommerce (Gudivada et al., 2014).

NoSQL databases are highly scalable and have a simplified data model, an extraordinarily bare query language, no mechanism for handling consistency and integrity amongst data, and almost no support for security at the database level (Okman et al., 2011; Nance et al., 2013).

Table 2. Comparison of RDBMS and NoSQL (Gudivada et al., 2014)

[Figure not included in this excerpt]

1.8. Objectives of my research and development

The main objective of the development was to create a logic which provides universal data storage for cloud-based (Oracle APEX) databases. This means that each input datum can be interpreted as an entity and an attribute at the same time in the same data table.

The objectives of the development were to create:

- a database model which ensures storage in one (physical) data table of all data input from relational databases
- a logic that makes virtual data table management and vertical data expansion possible in an RDBMS environment
- a coding system in the physical data table through which each (relational) data table can be interpreted as a virtual data table
- a mechanism by which each input datum can be interpreted as an entity and an attribute in the same database using one physical data table
- an automatic conversion mechanism which ensures the conversion from usual (relational) data tables to JT logic-based data tables (a NoSQL model-based database; see the sketch after this list)
- a shell which handles the one physical data table
- a user interface which manages the shell (and the virtual data tables)
- a demo framework (Java) for non-cloud usage
- new attributes, entities and virtual data tables created by simply inputting new records

Novel models have to serve new needs. During the development I focused on creating a logic which increases the performance of cloud-based database management systems.
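The conversion objective can be illustrated with a minimal sketch that unpivots the rows of an ordinary relational table into JT physical records. The jt_physical table (Section 1.4) and the soil_samples source table with its columns are hypothetical names for demonstration only; the dissertation's actual conversion code is given in Appendix A1.

```sql
-- Hypothetical relational source table.
CREATE TABLE soil_samples (
  sample_id    NUMBER PRIMARY KEY,
  site         VARCHAR2(50),
  moisture_pct NUMBER
);

-- Each row becomes one virtual record: a BVDT record naming the virtual
-- data table, plus one physical record per column value. A real converter
-- would draw the IDs from a global sequence so they stay unique across
-- all virtual data tables.
INSERT INTO jt_physical (id, attribute, value)
SELECT sample_id, 'BVDT', 'SOIL_SAMPLES' FROM soil_samples
UNION ALL
SELECT sample_id, 'SITE', site FROM soil_samples
UNION ALL
SELECT sample_id, 'MOISTURE_PCT', TO_CHAR(moisture_pct) FROM soil_samples;
```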

I examined the effects and importance of our self-developed database management technology on some production stages of farm to table: namely, the interoperability between soil and GHG production data from different data storage structures, food-stock records, and billing and payroll in the baking industry are examined in this dissertation. Furthermore, I examined the changes in the number of computer operators after the introduction of our database management technology in a company.

The schematic figure below shows the processes which were completed during my research and development (Fig. 1).

[Figure not included in this excerpt]

Fig. 1. Processes of my research and development (own edition)

The input data come from my research projects and a bakery company (Norpan Ltd.) where the developed JT logic was applied. My dataset was validated within the Hungarian-Japanese intergovernmental S&T cooperation program at Chiba University, Matsudo, Japan and the University of Debrecen, Debrecen, Hungary between 2009 and 2011. Our database was extended between 2011 and 2016 at the Institute of Agricultural Chemistry and Soil Science, University of Debrecen, Debrecen, Hungary.

2. Methods

The interoperability between data related to soil properties and GHG production was examined at Chiba University, Matsudo, Japan and the University of Debrecen. The methods supporting food-stock records are introduced at COOP Ltd. The changes in the number of computer operators were examined at Norpan Bakery Ltd.

In the present dissertation greenhouse gas data management is highlighted, but I also show examples from the developed billing and production-optimizing software to demonstrate the efficiency of JT. I applied the JT logic in Magic 9, Oracle 10, Oracle APEX and in our own developed framework (Java application).

2.1. Research methods

From the Hungarian side, I continued the more than 30-year-old long-term fertilization experiment at Debrecen-Látókép, Hungary. The aforementioned long-term fertilization experiment was set up in 1983, and my samples were taken from spring 2014 to spring 2015.

Applied methods

In our experiments, I examined the effect of different fertilizer doses, crop rotation, and irrigation on some soil parameters related to the carbon cycle (Table 3).

Table 3. Labels of fertilizer doses in different cultures (own edition)

[Figure not included in this excerpt]

Biculture: wheat, maize; Triculture: pea, wheat, maize

I determined the following physical, chemical and microbiological soil properties (Table 4).

Table 4. Soil parameters (own edition)

[Figure not included in this excerpt]

Statistical evaluations were completed with SPSS 20 for Windows.

I determined the soil moisture content gravimetrically by drying the soil at 105 °C for 24 h according to Klimes-Szmik (1962). The pH (H2O) and pH (KCl) were measured according to Filep (1995). The humus content was determined using potassium dichromate according to Székely (1960). The organic carbon was calculated according to Hargitainé (1995). The soil respiration was measured using alkaline sequestration in 1 L bottles according to Witkamp (1995). The microbial biomass carbon was determined by the chloroform fumigation-extraction method according to Jenkinson and Powlson (1976).

In the Japanese-Hungarian S&T research the following parameters were determined. The soil moisture content was determined gravimetrically by drying the soil at 105 °C for 24 h. We measured the soil pH in water suspensions (soil/water, 1/2.5, w/w) using a glass electrode. Electrical conductivity (EC) (soil/water, 1/5, w/w) was determined by a CM-14P EC meter. The concentrations of NO3−-N, NH4+-N, and NO2−-N were analyzed using a spectrophotometer. Total carbon (TC) and total nitrogen (TN) in air-dried soil were measured by a C/N corder. Soil MBC and MBN were determined with the chloroform fumigation-extraction method (Vance et al., 1987). Soil nitrate-nitrogen (NO3−-N), ammonium-nitrogen (NH4+-N), and nitrite-nitrogen (NO2−-N) were determined by extracting soil samples of ca. 10 g with 50 mL 1 M potassium chloride (KCl) within 1 week of the sampling date. The soils were pre-incubated at 25 °C for 72 h, then fumigated with chloroform at 25 °C for 24 h in sealed desiccators. Fumigated and unfumigated soil samples were extracted with 50 mL 0.5 M potassium sulfate (K2SO4) by shaking for 30 min. The concentration of organic C in the K2SO4 extracts was measured by a TOC analyzer (Kong et al., 2013; Mátyás et al., 2015).

Data sites used in the studies

In this subchapter, I present the datasets, which were first stored according to the usual (relational) model and afterwards transformed to the JT data storage structure. Seven sampling sites from central Japan (34°49'~36°36' N, 139°00'~54' E) were selected (Fig. 2).

[Figure not included in this excerpt]

Fig. 2. Sampling sites in Japan

The sampling sites in Japan were located at Atagawa (evergreen broadleaf forest and orange orchard), Kujuukuri (rice croplands), Matsudo (soybean croplands), and Numata (mixed forest, grassland, and apple orchard). Eight sites from eastern Hungary (47°55'~48°12' N, 21°23'~67' E) were selected (Fig. 3).

[Figure not included in this excerpt]

Fig. 3. Sampling sites in Hungary

The sampling sites in Hungary were located at Látókép (croplands), Hortobágy (grassland), Pallag (apple orchard), Dombostanya (grassland), Görbeháza (maize croplands), and Tokaj (forest).

[Figure not included in this excerpt]

Fig. 4. The location of the Debrecen-Látókép long-term fertilization experiment.

I examined the effect of different fertilizer doses (Control, N60P45K45, N120P90K90, N180P120K120, N240P180K180), crop rotation (maize mono-, bi- and triculture), and irrigation on some soil properties related to the carbon cycle.

The physical parameters of the soil sampling sites in both countries were taken as: land use, latitude/longitude, altitude, soil orders (FAO taxonomy), annual mean temperature (°C), annual mean precipitation (mm), and soil moisture (%). The chemical and microbial properties of soils from both countries were taken as: pH (H2O), EC (dS m−1), total C (mg g−1 dry soil (ds)), total N (mg g−1 ds), C/N ratio, microbial biomass C (MBC) (μg g−1 ds), microbial biomass N (MBN) (μg g−1 ds), soluble organic C (SOC) (μg g−1 ds), nitrate-N (NO3−-N) (μg g−1 ds), ammonium-N (NH4+-N) (μg g−1 ds), nitrite-N (NO2−-N) (μg g−1 ds), and MBC/total C (%). The concentrations of CO2, N2O, and CH4 were measured using gas chromatographs (GC-14B, Shimadzu, Japan) equipped with a thermal conductivity detector, an electron capture detector and a flame ionization detector, respectively (Singla & Inubushi, 2013). Equations in multiple regression models for the relationship between GHG production/consumption potential and the soil properties were stored in the database (Table 5). All the statistical analyses were completed using SPSS Statistics 20 (IBM, New York, USA).

Table 5. Multiple regression models for the relationship between the cumulative greenhouse gas (GHG) production/consumption (Y) and the soil properties (X) (Mátyás et al., 2014)

[Figure not included in this excerpt]

Y1: cumulative carbon dioxide (CO2) production; Y2: cumulative nitrous oxide (N2O) production; Y3: cumulative methane (CH4) consumption; X1: nitrite-nitrogen (NO2−-N); X2: ammonium-nitrogen (NH4+-N); X3: microbial biomass carbon (MBC); X4: microbial biomass nitrogen (MBN); X5: soluble organic carbon (SOC); X6: total carbon (TC); X7: total nitrogen (TN); X8: soil water content; X9: the ratio of MBC to TC; X10: soil carbon:nitrogen (C:N) ratio. #1 to #13 were analyzed by the single regression method. #14, #15, and #16 were analyzed by the stepwise regression method (Mátyás et al., 2015a).
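Table 5 itself is not included in this excerpt. Purely as an illustration of the general form such stored models take (the fitted coefficients are in the omitted table), a model regressing a cumulative GHG quantity on selected soil properties can be written as

$$Y_j = \beta_0 + \beta_1 X_{i_1} + \beta_2 X_{i_2} + \cdots + \beta_k X_{i_k} + \varepsilon,$$

where the single regression models (#1 to #13) use one predictor (k = 1) and the stepwise models (#14 to #16) select a subset of the ten soil properties.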

[...]

End of excerpt from 116 pages

Summary of information

Title
A novel database management logic designed for some important production stages of farm to table
University
National Autonomous University of Nicaragua
Grade
5
Author
Year
2016
Pages
116
Catalog number
V415404
ISBN (eBook)
9783668693562
ISBN (Book)
9783668693579
File size
2972 KB
Language
English
Citation
Bence Mátyás (Author), 2016, A novel database management logic designed for some important production stages of farm to table, Munich, GRIN Verlag, https://www.grin.com/document/415404
