Definition
Data structure is a collection of data elements with structural characteristics. It studies the logical structure of data and the physical structure of data, as well as the interaction between them Relations, and design the corresponding calculations for this structure definition, and ensure that the new structure obtained after these calculations still maintains the original structure type. In short, a data structure is a collection of data elements that have one or more specific relationships with each other, that is, a collection of data elements with a "structure". "Structure" refers to the relationship that exists between data elements, which is divided into logical structure and storage structure.
The logical structure and physical structure of data are two closely related aspects of the data structure. The same logical structure can correspond to different storage structures. The design of the algorithm depends on the logical structure of the data, and the realization of the algorithm depends on the specified storage structure.
The research content of data structure is the basis for constructing complex software systems, and its core technology is decomposition and abstraction. The data can be divided into three levels through decomposition; and then through abstraction, abandoning the specific content of the data elements, the logical structure is obtained. Similarly, by dividing the processing requirements into various functions, and then by abstracting away the implementation details, the definition of the operation is obtained. The combination of the above two aspects can transform the problem into a data structure. This is a process from concrete (that is, concrete problems) to abstract (that is, data structures). Then, by increasing the consideration of the implementation details, the storage structure and the implementation calculation are further obtained, thereby completing the design task. This is a process from abstraction (that is, data structure) to concrete (that is, concrete realization).
Research object
The logical structure of data
Refer to the data structure that reflects the logical relationship between data elements. The logical relationship is between data elements. The relationship between before and after, and has nothing to do with their storage location in the computer. The logical structure includes:
1. Set: There is no other relationship between the elements in the data structure except for the mutual relationship of "belonging to the same set";
2. Linear structure: The elements in the data structure have a one-to-one relationship;
3. Tree structure: The elements in the data structure have a one-to-many relationship;
4. Graphic structure : The elements in the data structure have a many-to-many relationship.
The physical structure of the data
refers to the storage form of the logical structure of the data in the computer storage space.
The physical structure of data is the representation of the data structure in the computer (also known as the image), which includes the internal representation of data elements and the internal representation of relationships. Since there are multiple specific implementation methods such as sequence, link, index, hash, etc., a data structure can be expressed as one or more storage structures.
In-machine representation of data elements (mapping method): A bit string of binary bits (bit) is used to represent data elements. This kind of bit string is usually called a node. When a data element consists of several data items, the sub-bit string corresponding to each data item in the bit string is called a data field. Therefore, the node is the internal representation (or internal image) of the data element.
In-machine representation of relationships (mapping method): The in-machine representation of relationships between data elements can be divided into sequential images and non-sequential images. Two commonly used storage structures: sequential storage structure and chain storage Structure. Sequence mapping uses the relative position of the elements in the memory to represent the logical relationship between data elements. Non-sequential mapping uses pointers that indicate storage locations of elements to represent logical relationships between data elements.
Data storage structure
The storage form of the logical structure of data in the computer storage space is called the physical structure of the data (also called the storage structure). Generally speaking, the logical structure of a data structure can be expressed as a variety of storage structures as needed. Common storage structures include sequential storage, chain storage, index storage, and hash storage.
The characteristic of the sequential storage structure of data is: the logical relationship between the data elements is expressed by the relative position of the element in the memory; the characteristic of non-sequential storage is: the data is expressed by the pointer indicating the storage address of the element The logical relationship between elements.
Classification
There are many types of data structures. Generally speaking, data are simply classified according to their logical structure, including linear and non-linear structures.
Linear structure
Simply put, a linear structure means that each node in the table has a linear relationship. If described in the language of the data structure, the linear structure should include the following points:
1. The linear structure is a non-empty set.
2. The linear structure has one and only one start node and one end node.
3. All nodes of the linear structure have at most one direct predecessor node and one direct successor node.
Linear tables are typical linear structures, and stacks, queues, and strings are all linear structures.
Nonlinear structure
Simply put, a non-linear structure means that there are multiple correspondences between each node in the table. If described in the language of data structure, the nonlinear structure should include the following points:
1. The nonlinear structure is a non-empty set.
2. A node of a nonlinear structure may have multiple direct predecessor nodes and multiple direct successor nodes.
In practical applications, data structures such as arrays, generalized tables, tree structures, and graph structures are all non-linear structures.
Commonly used data structures
During the development of computer science, data structures have also evolved. Commonly used data structures in program design include the following.
Array (Array)
Array is an aggregate data type, which is a collection of several variables of the same type organized together in an orderly manner. Array can be said to be the most basic data structure, which corresponds to various programming languages. An array can be decomposed into multiple array elements. According to the types of data elements, the array can be divided into integer arrays, character arrays, floating-point arrays, pointer arrays, and structure arrays. Arrays can also be expressed in one-dimensional, two-dimensional, and multi-dimensional forms.
Stack
Stack is a special linear table. It can only insert and delete data nodes on one fixed end of a table. The stack stores data according to the last-in, first-out principle, that is, the first inserted data will be pushed into the bottom of the stack, and the last inserted data will be at the top of the stack. When reading data, it will be read out one by one from the top of the stack. The stack is often used for on-site protection of important data in assembly language programs. When there is no data in the stack, it is called an empty stack.
Queue
The queue is similar to the stack, and it is also a special linear table. Unlike the stack, the queue only allows insert operations on one end of the table and delete operations on the other end. Generally speaking, the end that performs the insertion operation is called the tail of the queue, and the end that performs the delete operation is called the head of the queue. When there are no elements in the queue, it is called an empty queue.
Linked List
Linked list is a data structure in which data elements are stored according to a chained storage structure. This storage structure is physically non-continuous. The linked list is composed of a series of data nodes, and each data node includes two parts: data field and pointer field. Among them, the pointer field holds the address where the next element in the data structure is stored. The logical order of the data elements in the linked list structure is realized by the link order of the pointers in the linked list.
Tree
Tree is a typical nonlinear structure, which is a finite set K including two nodes. In the tree structure, there is one and only one root node, which has no predecessor node. All other nodes in the tree structure have one and only one predecessor node, and there can be two successor nodes, m≥0.
Graph
Graph is another non-linear data structure. In the graph structure, data nodes are generally called vertices, and edges are ordered pairs of vertices. If there is an edge between two vertices, it means that the two vertices have an adjacent relationship.
Heap
Heap is a special tree-shaped data structure, and the heaps generally discussed are binary heaps. The characteristic of the heap is that the value of the root node is the smallest or the largest among all nodes, and the two subtrees of the root node are also a heap structure.
Hash table (Hash)
The hash table is derived from the hash function (Hash function). The idea is that if there is a record with the keyword equal to T in the structure, then it must The record can be found in the storage location of F(T), so that the checked record can be directly obtained without a comparison operation.
Commonly used algorithms
The content of data structure research: how to organize the data according to a certain logical structure, and choose the appropriate storage representation method to store the data organized in the logical structure To the computer's memory. The purpose of algorithm research is to process data more effectively and improve the efficiency of data operation. The calculation of data is defined on the logical structure of the data, but the specific realization of the calculation must be carried out on the storage structure. Generally, there are the following common operations:
(1) Search. Retrieval is to find nodes that meet certain conditions in the data structure. Generally, given the value of a certain field, find the node with the value of the field.
(2) Insert. Add new nodes to the data structure.
(3) Delete. Remove the specified node from the data structure.
(4) Update. Change the value of one or more fields of the specified node.
(5) Sort. Rearrange the nodes in a specified order. For example, increment or decrement.