Big Data related terms (2)

Source: Internet
Author: User

27. LDB (Local Database)
A local database is a database that resides on the machine where the client application runs. A local database gives the fastest response time, because no network transport sits between the client and the data.

A local database is located on a local disk or on a LAN. If several users access the database at the same time, the local database relies on a file-locking policy; for this reason, local databases are also called file-based databases. Typical examples are Paradox, dBASE, FoxPro, and Access.

28. DAQ (Data Acquisition)
Data acquisition is the process of sensing the various parameters of the object under test with sensors, converting them appropriately, and passing them to a controller through signal conditioning, sampling, quantization, encoding, transmission, and other steps.
The acquisition process is basically the same in every data acquisition system and generally includes the following steps:
1. Use sensors to sense the various physical quantities and convert them into electrical signals;
2. Convert the analog signals into digital data through A/D conversion;
3. Record the data, print it out, or write it to a disk file.
The acquisition programs used in data acquisition systems fall into three kinds: large dedicated programs, fixed acquisition programs (small dedicated systems), and acquisition programs written by users with software tools (modular systems).
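As a rough illustration of steps 1 and 2 above, here is a minimal Python sketch; the sine-wave "sensor", the sample rate, and the 8-bit converter resolution are all made-up assumptions:

```python
import math

# Hypothetical analog source: a 5 Hz sine wave on a 0-5 V sensor range.
def analog_voltage(t):
    return 2.5 + 2.5 * math.sin(2 * math.pi * 5 * t)

SAMPLE_RATE = 100        # samples per second (assumed)
BITS = 8                 # assumed resolution of the A/D converter
V_MIN, V_MAX = 0.0, 5.0  # converter input range
LEVELS = 2 ** BITS

def quantize(v):
    """Map an analog voltage to an integer code (the A/D conversion step)."""
    v = min(max(v, V_MIN), V_MAX)                         # clip to the input range
    return int((v - V_MIN) / (V_MAX - V_MIN) * (LEVELS - 1))

# Sample for 0.1 s and record the digital codes
# (step 3 would print these or write them to a disk file).
samples = [quantize(analog_voltage(n / SAMPLE_RATE)) for n in range(10)]
print(samples)
```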

29. Data Model

A data model is an abstraction that describes the characteristics of real-world data: it describes the content and definition of a set of data. The way a data model organizes and stores data in the computer is the foundation of a database system. In a database, the physical structure of data, also called its storage structure, is the representation and arrangement of data elements in computer storage; the logical structure of data refers to the logical relationships among data elements and is how the data appears to users or programmers. The storage structure of data is not necessarily consistent with its logical structure.

30. Paradigm (Normal Form, database terminology)
A normal form (paradigm) is a level that a set of relation schemas conforms to: relations in a relational database must satisfy certain requirements, and different normal forms impose different requirements.

The normal forms currently in use are: first normal form (1NF), second normal form (2NF), third normal form (3NF), Boyce-Codd normal form (BCNF), fourth normal form (4NF), and fifth normal form (5NF).
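As a small, hypothetical illustration of what these requirements mean in practice: in the relation below, product_name depends only on product_id rather than on the whole key (order_id, product_id), which violates second normal form; decomposing the relation removes the redundancy. The table and column names are invented for the example:

```python
# Hypothetical un-normalized relation: product_name depends only on
# product_id, not on the whole key (order_id, product_id) -- a 2NF violation.
orders = [
    # (order_id, product_id, product_name, quantity)
    (1, "P1", "Keyboard", 2),
    (2, "P1", "Keyboard", 1),   # "Keyboard" is stored redundantly
    (2, "P2", "Mouse",    3),
]

# Decomposition into two relations, each free of the partial dependency:
order_lines = [(o, p, q) for (o, p, _, q) in orders]     # key: (order_id, product_id)
products    = {p: name for (_, p, name, _) in orders}    # key: product_id

print(order_lines)  # [(1, 'P1', 2), (2, 'P1', 1), (2, 'P2', 3)]
print(products)     # {'P1': 'Keyboard', 'P2': 'Mouse'}
```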

31. Data Compression
Data compression means representing the signal emitted by a source with as few digits as possible, that is, using less signal space to hold a given message set or data sample set. The signal space here is the object being compressed; it refers to the time domain, spatial domain, and frequency domain occupied by a given signal set. These forms of signal space are interrelated: a reduction in storage space also means higher signal transmission efficiency and savings in bandwidth. Any method that reduces a signal space compresses the data.

Data compression is one of the most important concepts in information theory. From the perspective of information theory, one of the main purposes of source coding is to solve the problem of data compression, and this is reflected throughout the communication process.
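As a minimal concrete example (using Python's standard zlib module; the sample data is made up), a repetitive message set can be represented in far fewer bytes and recovered losslessly:

```python
import zlib

# Highly repetitive data compresses well; random-looking data would not.
data = b"ABABABABAB" * 100          # 1000 bytes of source data
packed = zlib.compress(data)        # lossless compression (DEFLATE)

print(len(data), "->", len(packed), "bytes")
assert zlib.decompress(packed) == data   # the original is fully recoverable
```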

32. Data Recovery

Data recovery refers to restoring data that was lost for any of a variety of reasons but is still retained on the media. Even if data has been deleted or the hard drive has failed, the data can be recovered as long as the media is not seriously damaged. For data lost through formatting or accidental deletion, most of the data is not corrupted, and it can be read again once the file links are rebuilt by software. If the hard drive is inaccessible because of hardware damage, the data can be recovered once the faulty part is replaced. However, when the media is severely damaged or the data has been overwritten, the data is extremely difficult to recover.

33. Data Integration

Data integration is the logical or physical consolidation of data from a number of disparate data sources into one agreed set of data. The core task of data integration is to integrate interconnected, distributed, heterogeneous data sources so that users can access them in a transparent way. Integration means maintaining consistency across the data sources as a whole and improving the efficiency of information sharing and use; transparency means that users need not be concerned with how access to heterogeneous data sources is implemented, only with what data to access. A system that realizes data integration is called a data integration system: it provides users with a unified interface to the data sources and executes users' requests for access to them.
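A toy sketch of the transparency idea, with both "sources" and all field names invented for illustration: a thin integration layer maps each heterogeneous source into one agreed schema, so the user queries a single interface:

```python
# Two hypothetical heterogeneous sources with different schemas.
source_a = [{"cust_id": 1, "full_name": "Alice"}]          # e.g. a CRM export
source_b = [{"id": 2, "name": "Bob", "city": "Hangzhou"}]  # e.g. a sales database

def unified_customers():
    """Present both sources under one agreed schema: (id, name)."""
    for row in source_a:
        yield {"id": row["cust_id"], "name": row["full_name"]}
    for row in source_b:
        yield {"id": row["id"], "name": row["name"]}

# The user sees one consistent data set, not two heterogeneous sources.
print(list(unified_customers()))
```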

34. Data Migration

Data migration is a key step in ensuring a smooth system upgrade during data system integration. In the course of building information systems, technology advances and the original system is replaced by a more powerful new one: from a two-tier structure to a three-tier structure, from C/S to B/S. In switching from the old system to the new, the problem of data migration must be faced.

35. Data Element

A data element is a unit of data described by a series of attributes such as definition, identification, representation, and permissible values; in a given context it constitutes an information unit that is semantically correct, independent, and unambiguous. A data element can be understood as the basic unit of data; a number of related data elements, organized in a certain order, form a whole structure, i.e. a data model.

36. Data Redundancy

Data redundancy means that the same data is stored repeatedly in the system. In a file system, because there are no connections between files, the same piece of data sometimes appears in multiple files; a database system overcomes this defect of the file system, but the data redundancy problem still exists. The purpose of eliminating data redundancy is to avoid the anomalies that can arise during updates and to maintain data consistency.
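A tiny, hypothetical illustration of the update problem: when the same phone number is stored redundantly in two places, a partial update leaves the system inconsistent, whereas a single shared record cannot diverge:

```python
# Redundant storage: the customer's phone number is duplicated.
orders   = {"order_1": {"customer": "Alice", "phone": "555-0100"}}
invoices = {"inv_1":   {"customer": "Alice", "phone": "555-0100"}}

orders["order_1"]["phone"] = "555-0199"   # update one copy only...
# ...and the system is now inconsistent:
print(orders["order_1"]["phone"], "!=", invoices["inv_1"]["phone"])

# Eliminating the redundancy: store the phone once and reference it.
customers = {"Alice": {"phone": "555-0100"}}
customers["Alice"]["phone"] = "555-0199"  # one update, consistent everywhere
```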

37. Data Extraction

Data extraction is the process of pulling data out of a data source; specifically, it means extracting from the source data system the data needed by the destination data system. In practical applications, relational databases are the most common sources.
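A minimal sketch of extraction from a relational source, using Python's standard sqlite3 module; the sales table and its columns are invented for the example:

```python
import sqlite3

# Build a throwaway source database to extract from.
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE sales (id INTEGER, region TEXT, amount REAL)")
src.executemany("INSERT INTO sales VALUES (?, ?, ?)",
                [(1, "east", 120.0), (2, "west", 80.5), (3, "east", 42.0)])

# Extraction: pull only the rows and columns the destination system needs.
rows = src.execute(
    "SELECT id, amount FROM sales WHERE region = ?", ("east",)
).fetchall()
print(rows)  # [(1, 120.0), (3, 42.0)]
```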

38. Data Standardization (Normalization)

Data standardization refers to the process of researching, developing, and applying unified technical standards for data classification, grading, record formats, conversion, coding, and so on.

39. Data Backup
Data backup is the activity of copying a file or database from its original storage location to another location, in order to protect the data against failures or other threats to data security and to minimize the damage when data is compromised. The process of restoring data from a backup file is called data recovery. Three common backup strategies:
1. Full backup
The advantage of this strategy is that when a data-loss disaster occurs, the lost data can be recovered quickly.
Drawback: performing a full backup of the entire system every day produces a large amount of duplicated backup data. For users with busy systems and limited backup windows, this strategy is unwise.
2. Incremental backup
First make a full backup, then back up only the data that is new or modified since the previous backup; this saves disk space and shortens backup time. The disadvantages are that disaster recovery is cumbersome (every increment since the last full backup must be restored in order) and the backup is less reliable (losing any one increment breaks the chain).
3. Differential backup
First make a full backup of the system, then back up each day all data that differs from that full backup. This avoids the flaws of the two strategies above while keeping their merits. First, it removes the need for a daily full backup of the system, saving time and disk space. Second, disaster recovery is convenient: once a problem occurs, the user needs only the full backup plus the differential backup from the day before the problem to restore the system. The sketch after this list contrasts all three strategies.
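The following sketch contrasts the three strategies on hypothetical file metadata: a full backup copies everything, an incremental backup copies files changed since the most recent backup of any kind, and a differential backup copies files changed since the last full backup:

```python
# Hypothetical file modification times (day numbers, for simplicity).
files = {"a.db": 1, "b.log": 3, "c.cfg": 5}

LAST_FULL_BACKUP = 2   # day of the last full backup
LAST_ANY_BACKUP  = 4   # day of the most recent backup of any kind

full         = list(files)                                            # everything
incremental  = [f for f, t in files.items() if t > LAST_ANY_BACKUP]   # since last backup
differential = [f for f, t in files.items() if t > LAST_FULL_BACKUP]  # since last full

print(full)          # ['a.db', 'b.log', 'c.cfg']
print(incremental)   # ['c.cfg']
print(differential)  # ['b.log', 'c.cfg']
```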

40. Greedy Algorithm
A greedy algorithm always makes the choice that looks best at the moment when solving a problem. In other words, it does not consider overall optimality; what it obtains is, in some sense, only a locally optimal solution.

A greedy algorithm does not obtain the overall optimal solution for every problem; the key lies in the choice of greedy strategy. The chosen strategy must have no aftereffects: states before the current one do not affect later states, which depend only on the current state.
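A classic sketch of both points: greedy coin change. With a canonical coin system the locally best choice (take the largest coin that fits) happens to be globally optimal; with another, made-up coin system the same greedy strategy fails:

```python
def greedy_change(amount, coins):
    """Repeatedly take the largest coin that still fits."""
    result = []
    for coin in sorted(coins, reverse=True):
        while amount >= coin:
            amount -= coin
            result.append(coin)
    return result

print(greedy_change(63, [25, 10, 5, 1]))  # [25, 25, 10, 1, 1, 1] -- optimal here
print(greedy_change(6,  [4, 3, 1]))       # [4, 1, 1] -- but [3, 3] is better
```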

41. Divide and Conquer

Divide and conquer is a very important algorithmic technique in computer science: divide a complex problem into two or more identical or similar subproblems, and divide those into still smaller subproblems, until the final subproblems can be solved simply and directly. The solution to the original problem is then the merging of the subproblem solutions. This technique is the basis of many efficient algorithms (sorting algorithms, the fast Fourier transform).
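Merge sort is the textbook instance; a minimal sketch:

```python
def merge_sort(xs):
    if len(xs) <= 1:                 # subproblem small enough to solve directly
        return xs
    mid = len(xs) // 2
    left, right = merge_sort(xs[:mid]), merge_sort(xs[mid:])  # divide + conquer
    # Merge: combine the two sorted halves into one sorted list.
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]

print(merge_sort([5, 2, 9, 1, 5, 6]))  # [1, 2, 5, 5, 6, 9]
```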

42. Dynamic Programming

Dynamic programming is a branch of operations research and a mathematical method for optimizing multi-stage decision processes: a multi-stage process is transformed into a series of single-stage problems, which are solved one by one using the relationships between the stages.
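A minimal sketch of the stage-by-stage idea on a made-up layered graph: the cheapest route to the destination is computed one stage at a time, working backwards, so each stage becomes a single-stage problem:

```python
# Hypothetical layered graph: stages[k][i][j] is the cost of going from
# node i in stage k to node j in stage k+1.
stages = [
    [[2, 5]],            # stage 0 (start) -> stage 1 (2 nodes)
    [[4, 1], [3, 2]],    # stage 1 -> stage 2 (2 nodes)
    [[3], [6]],          # stage 2 -> stage 3 (destination)
]

# Work backwards: best[i] = cheapest cost from node i of the current stage
# to the destination.
best = [0]   # cost from the destination to itself
for layer in reversed(stages):
    best = [min(c + best[j] for j, c in enumerate(row)) for row in layer]

print(best[0])  # minimum total cost from start to destination (9 here)
```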

43. Iterative Method

The iterative method, also known as the method of successive substitution, is the process of repeatedly deriving new values of a variable from its old values. Iterations divide into exact iterations and approximate iterations; the bisection method and Newton's iterative method belong to the approximate iterative methods. The iterative algorithm is a basic method of solving problems with a computer. It exploits the computer's speed and its suitability for repetitive operations, making the machine execute a group of instructions over and over; at each execution, the new value of the variable is derived from its old value.
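Newton's iterative method for square roots, mentioned above, shows the pattern directly: each pass derives a new value of the variable from its old value until successive values converge (a minimal sketch, assuming a > 0):

```python
def newton_sqrt(a, tol=1e-12):
    """Solve x*x = a by Newton's iteration: x_new = (x + a/x) / 2. Assumes a > 0."""
    x = a if a > 1 else 1.0      # initial guess
    while True:
        x_new = (x + a / x) / 2  # new value derived from the old value
        if abs(x_new - x) < tol: # stop when successive values converge
            return x_new
        x = x_new

print(newton_sqrt(2.0))  # 1.4142135623730951
```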

44. Branch and Bound
Branch and bound is a widely used algorithmic technique. It is very flexible, and its details differ from one type of problem to another.

Basic idea: search the space of all feasible solutions of a constrained optimization problem. As the algorithm runs, the feasible solution space is repeatedly split into smaller and smaller subsets (branching), and for each subset a lower or upper bound on the value of the solutions inside it is computed (bounding). After each branching, any subset whose bound is worse than the value of an already-known feasible solution is not branched further. Many subsets of solutions can thus be left out of consideration, narrowing the search. The process continues until a feasible solution is found whose value is no worse than the bound of any remaining subset; the algorithm therefore generally obtains the optimal solution.
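A compact sketch of the idea on the 0/1 knapsack problem, with items and capacity invented for the example: each node branches on taking or skipping an item, and a fractional-relaxation upper bound lets whole subtrees be pruned:

```python
# Hypothetical items: (value, weight), sorted by value/weight for the bound.
items = sorted([(60, 10), (100, 20), (120, 30)],
               key=lambda vw: vw[0] / vw[1], reverse=True)
CAPACITY = 50

def bound(i, value, room):
    """Upper bound: fill the remaining room greedily, allowing a fractional item."""
    for v, w in items[i:]:
        if w <= room:
            value, room = value + v, room - w
        else:
            return value + v * room / w   # fractional part of the next item
    return value

best = 0

def search(i, value, room):
    global best
    if value > best:
        best = value                        # record the best feasible solution so far
    if i == len(items) or bound(i, value, room) <= best:
        return                              # prune: this subtree cannot beat `best`
    v, w = items[i]
    if w <= room:
        search(i + 1, value + v, room - w)  # branch 1: take item i
    search(i + 1, value, room)              # branch 2: skip item i

search(0, 0, CAPACITY)
print(best)  # 220 (the items with values 100 and 120)
```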

45. Cyclotomic Method
A method that approximates the area of a circle by the areas of inscribed regular polygons, and thereby computes pi: as the number of sides doubles, the polygon approaches the circle.
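A short sketch of the perimeter variant of the same idea (the recurrence uses only the half-angle identity, no prior knowledge of pi): start from a regular hexagon inscribed in a unit circle and repeatedly double the number of sides; the half-perimeter approaches pi:

```python
import math

n, s = 6, 1.0   # a regular hexagon inscribed in a unit circle has side length 1
for _ in range(12):
    # Side length after doubling the number of sides (half-angle identity).
    s = math.sqrt(2 - math.sqrt(4 - s * s))
    n *= 2
    print(n, "sides:", n * s / 2)   # half-perimeter approximates pi

print("math.pi =", math.pi)
```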
