translation: Add the initial translation of the array and linked list chapter (#1008)

* Add the translation of the data structure chapter. Synchronize the headings in mkdocs-en.yml

* Fix a typo

* Add the translation of the array and linked-list chapter
Yudong Jin 2023-12-27 00:42:55 +08:00 committed by GitHub
parent 42523b8879
commit 19dde675df
17 changed files with 1921 additions and 12 deletions


@ -0,0 +1,210 @@
# Arrays
The "array" is a linear data structure that stores elements of the same type in contiguous memory locations. We refer to the position of an element in the array as its "index". The following image illustrates the main terminology and concepts of an array.
![Array Definition and Storage Method](array.assets/array_definition.png)
## Common Operations on Arrays
### Initializing Arrays
There are two ways to initialize arrays depending on the requirements: without initial values and with given initial values. In cases where initial values are not specified, most programming languages will initialize the array elements to $0$:
=== "Python"
```python title="array.py"
# Initialize array
arr: list[int] = [0] * 5 # [ 0, 0, 0, 0, 0 ]
nums: list[int] = [1, 3, 2, 5, 4]
```
=== "C++"
```cpp title="array.cpp"
/* Initialize array */
// Stored on stack
int arr[5] = {}; // { 0, 0, 0, 0, 0 } (a plain "int arr[5];" at block scope is left uninitialized)
int nums[5] = { 1, 3, 2, 5, 4 };
// Stored on heap (manual memory release needed)
int* arr1 = new int[5];
int* nums1 = new int[5] { 1, 3, 2, 5, 4 };
```
=== "Java"
```java title="array.java"
/* Initialize array */
int[] arr = new int[5]; // { 0, 0, 0, 0, 0 }
int[] nums = { 1, 3, 2, 5, 4 };
```
=== "C#"
```csharp title="array.cs"
/* Initialize array */
int[] arr = new int[5]; // { 0, 0, 0, 0, 0 }
int[] nums = [1, 3, 2, 5, 4];
```
=== "Go"
```go title="array.go"
/* Initialize array */
var arr [5]int
// In Go, specifying the length ([5]int) denotes an array, while not specifying it ([]int) denotes a slice.
// Since Go's arrays are designed to have compile-time fixed length, only constants can be used to specify the length.
// For convenience in implementing the extend() method, the Slice will be considered as an Array here.
nums := []int{1, 3, 2, 5, 4}
```
=== "Swift"
```swift title="array.swift"
/* Initialize array */
let arr = Array(repeating: 0, count: 5) // [0, 0, 0, 0, 0]
let nums = [1, 3, 2, 5, 4]
```
=== "JS"
```javascript title="array.js"
/* Initialize array */
var arr = new Array(5).fill(0);
var nums = [1, 3, 2, 5, 4];
```
=== "TS"
```typescript title="array.ts"
/* Initialize array */
let arr: number[] = new Array(5).fill(0);
let nums: number[] = [1, 3, 2, 5, 4];
```
=== "Dart"
```dart title="array.dart"
/* Initialize array */
List<int> arr = List.filled(5, 0); // [0, 0, 0, 0, 0]
List<int> nums = [1, 3, 2, 5, 4];
```
=== "Rust"
```rust title="array.rs"
/* Initialize array */
let arr: Vec<i32> = vec![0; 5]; // [0, 0, 0, 0, 0]
let nums: Vec<i32> = vec![1, 3, 2, 5, 4];
```
=== "C"
```c title="array.c"
/* Initialize array */
int arr[5] = { 0 }; // { 0, 0, 0, 0, 0 }
int nums[5] = { 1, 3, 2, 5, 4 };
```
=== "Zig"
```zig title="array.zig"
// Initialize array
var arr = [_]i32{0} ** 5; // { 0, 0, 0, 0, 0 }
var nums = [_]i32{ 1, 3, 2, 5, 4 };
```
### Accessing Elements
Elements in an array are stored in contiguous memory locations, which makes it easy to compute the memory address of any element. Given the memory address of the array (the address of the first element) and the index of an element, we can calculate the memory address of that element using the formula shown in the following image, allowing direct access to the element.
![Memory Address Calculation for Array Elements](array.assets/array_memory_location_calculation.png)
As the image above shows, the index of the first element of an array is $0$, which may seem counterintuitive since we usually start counting from $1$. However, from the perspective of the address calculation formula, **an index is essentially an offset from the array's memory address**. The first element sits at an offset of $0$ from that address, so it is logical for its index to be $0$.
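The offset idea can be sketched in a few lines of Python. This is only an illustration: it assumes the formula in the image has the usual form "element address = array address + element size × index", and the function name `element_address` and the sample address `1000` are hypothetical.

```python
def element_address(base_address: int, element_size: int, index: int) -> int:
    """Return the address of the element at `index`.

    Assumed formula: base address + element size * index.
    """
    return base_address + element_size * index


# Hypothetical example: five 4-byte integers stored from address 1000
addresses = [element_address(1000, 4, i) for i in range(5)]
print(addresses)  # [1000, 1004, 1008, 1012, 1016]
```

With index $0$ the offset term vanishes, so the first element's address equals the array's address.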
Accessing elements in an array is highly efficient, allowing us to randomly access any element in $O(1)$ time.
```src
[file]{array}-[class]{}-[func]{random_access}
```
### Inserting Elements
As shown in the image below, to insert an element in the middle of an array, all elements following the insertion point must be moved one position back to make room for the new element.
![Array Element Insertion Example](array.assets/array_insert_element.png)
It's important to note that since the length of an array is fixed, inserting an element will inevitably lead to the loss of the last element in the array. We will discuss solutions to this problem in the "List" chapter.
```src
[file]{array}-[class]{}-[func]{insert}
```
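As a rough Python sketch of the insertion described above (not necessarily the project's own `insert` implementation), a Python `list` is treated here as a fixed-length array, so its length never changes and the last element is overwritten:

```python
def insert(nums: list[int], num: int, index: int) -> None:
    """Insert `num` at `index` in a fixed-length array; the last element is lost."""
    # Move elements at and after `index` one position back, starting from the end
    for i in range(len(nums) - 1, index, -1):
        nums[i] = nums[i - 1]
    # Place the new element in the freed slot
    nums[index] = num


nums = [1, 3, 2, 5, 4]
insert(nums, 6, 2)
print(nums)  # [1, 3, 6, 2, 5]
```

The loop runs from the end toward `index` so that no element is overwritten before it has been copied.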
### Deleting Elements
Similarly, as illustrated below, to delete an element at index $i$, all elements following index $i$ must be moved forward by one position.
![Array Element Deletion Example](array.assets/array_remove_element.png)
Note that after deletion, the last element becomes "meaningless", so we do not need to specifically modify it.
```src
[file]{array}-[class]{}-[func]{remove}
```
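The deletion step can be sketched in Python in the same spirit, again treating a `list` as a fixed-length array. The function below is an illustrative assumption rather than the project's own `remove` code:

```python
def remove(nums: list[int], index: int) -> None:
    """Delete the element at `index` by moving all later elements one position forward."""
    for i in range(index, len(nums) - 1):
        nums[i] = nums[i + 1]
    # The last slot is left untouched; its value is now "meaningless"


nums = [1, 3, 2, 5, 4]
remove(nums, 1)
print(nums)  # [1, 2, 5, 4, 4]
```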
Overall, the insertion and deletion operations in arrays have the following disadvantages:
- **High Time Complexity**: Both insertion and deletion in an array have an average time complexity of $O(n)$, where $n$ is the length of the array.
- **Loss of Elements**: Due to the fixed length of arrays, elements that exceed the array's capacity are lost during insertion.
- **Waste of Memory**: We can initialize a longer array and use only its front part, so that the elements pushed off the end during insertion are "meaningless" ones, but this wastes some memory space.
### Traversing Arrays
In most programming languages, we can traverse an array either by indices or by directly iterating over each element:
```src
[file]{array}-[class]{}-[func]{traverse}
```
### Finding Elements
To find a specific element in an array, we need to iterate through it, checking each element to see if it matches.
Since arrays are linear data structures, this operation is known as "linear search".
```src
[file]{array}-[class]{}-[func]{find}
```
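A minimal Python sketch of linear search follows. Returning `-1` when the target is absent is an assumed convention for this example:

```python
def find(nums: list[int], target: int) -> int:
    """Return the index of the first element equal to `target`, or -1 if absent."""
    for i, num in enumerate(nums):
        if num == target:
            return i
    return -1


nums = [1, 3, 2, 5, 4]
print(find(nums, 5))  # 3
print(find(nums, 7))  # -1
```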
### Expanding Arrays
In complex system environments, it's challenging to ensure that the memory space following an array is available, making it unsafe to extend the array's capacity. Therefore, in most programming languages, **the length of an array is immutable**.
To expand an array, we need to create a larger array and then copy the elements from the original array. This operation has a time complexity of $O(n)$ and can be time-consuming for large arrays. The code is as follows:
```src
[file]{array}-[class]{}-[func]{extend}
```
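The create-and-copy approach might look like the following Python sketch. The parameter name `enlarge` and the choice to fill new slots with `0` are assumptions for illustration:

```python
def extend(nums: list[int], enlarge: int) -> list[int]:
    """Return a new array that is `enlarge` slots longer, with the old elements copied over."""
    # Allocate the larger array first
    res = [0] * (len(nums) + enlarge)
    # Copy every element from the original array: this is the O(n) part
    for i in range(len(nums)):
        res[i] = nums[i]
    return res


nums = [1, 3, 2, 5, 4]
nums = extend(nums, 3)
print(nums)  # [1, 3, 2, 5, 4, 0, 0, 0]
```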
## Advantages and Limitations of Arrays
Arrays are stored in contiguous memory spaces and consist of elements of the same type. This approach includes a wealth of prior information that the system can use to optimize the operation efficiency of the data structure.
- **High Space Efficiency**: Arrays allocate a contiguous block of memory for data, eliminating the need for additional structural overhead.
- **Support for Random Access**: Arrays allow $O(1)$ time access to any element.
- **Cache Locality**: When accessing array elements, the computer not only loads them but also caches the surrounding data, leveraging high-speed cache to improve the speed of subsequent operations.
However, continuous space storage is a double-edged sword, with the following limitations:
- **Low Efficiency in Insertion and Deletion**: When there are many elements in an array, insertion and deletion operations require moving a large number of elements.
- **Fixed Length**: The length of an array is fixed after initialization. Expanding an array requires copying all data to a new array, which is costly.
- **Space Wastage**: If the allocated size of an array exceeds the actual need, the extra space is wasted.
## Typical Applications of Arrays
Arrays are a fundamental and common data structure, frequently used in various algorithms and in implementing complex data structures.
- **Random Access**: If we want to randomly sample some data, we can use an array for storage and generate a random sequence to implement random sampling based on indices.
- **Sorting and Searching**: Arrays are the most commonly used data structure for sorting and searching algorithms. Quick sort, merge sort, binary search, etc., are primarily conducted on arrays.
- **Lookup Tables**: Arrays can be used as lookup tables for fast element or relationship retrieval. For instance, to implement a mapping keyed by characters, we can use each character's ASCII code value as the index and store the corresponding element at that position in the array.
- **Machine Learning**: Arrays are extensively used in neural networks for linear algebra operations between vectors, matrices, and tensors. Arrays are the most commonly used data structure in neural network programming.
- **Data Structure Implementation**: Arrays can be used to implement stacks, queues, hash tables, heaps, graphs, etc. For example, the adjacency matrix representation of a graph is essentially a two-dimensional array.


@ -0,0 +1,9 @@
# Arrays and Linked Lists
![Arrays and Linked Lists](../assets/covers/chapter_array_and_linkedlist.jpg)
!!! abstract
The world of data structures is like a solid brick wall.
The bricks of an array are neatly arranged, each closely connected to the next. In contrast, the bricks of a linked list are scattered, with vines of connections freely weaving through the gaps between bricks.


@ -0,0 +1,668 @@
# Linked Lists
Memory space is a common resource for all programs. In a complex system environment, free memory space can be scattered throughout memory. We know that the memory space for storing an array must be contiguous, and when the array is very large, it may not be possible to provide such a large contiguous space. This is where the flexibility advantage of linked lists becomes apparent.
A "linked list" is a linear data structure where each element is a node object, and the nodes are connected via "references". A reference records the memory address of the next node, allowing access to the next node from the current one.
The design of a linked list allows its nodes to be scattered throughout memory, with no need for contiguous memory addresses.
![Linked List Definition and Storage Method](linked_list.assets/linkedlist_definition.png)
Observing the image above, the fundamental unit of a linked list is the "node" object. Each node contains two pieces of data: the "value" of the node and the "reference" to the next node.
- The first node of a linked list is known as the "head node", and the last one is called the "tail node".
- The tail node points to "null", which is represented as $\text{null}$ in Java, $\text{nullptr}$ in C++, and $\text{None}$ in Python.
- In languages that support pointers, like C, C++, Go, and Rust, the aforementioned "reference" should be replaced with a "pointer".
As shown in the following code, a linked list node `ListNode`, apart from containing a value, also needs to store a reference (pointer). Therefore, **a linked list consumes more memory space than an array for the same amount of data**.
=== "Python"
```python title=""
class ListNode:
    """Linked List Node Class"""
    def __init__(self, val: int):
        self.val: int = val                # Node value
        self.next: ListNode | None = None  # Reference to the next node
```
=== "C++"
```cpp title=""
/* Linked List Node Structure */
struct ListNode {
int val; // Node value
ListNode *next; // Pointer to the next node
ListNode(int x) : val(x), next(nullptr) {} // Constructor
};
```
=== "Java"
```java title=""
/* Linked List Node Class */
class ListNode {
int val; // Node value
ListNode next; // Reference to the next node
ListNode(int x) { val = x; } // Constructor
}
```
=== "C#"
```csharp title=""
/* Linked List Node Class */
class ListNode(int x) { // Constructor
    public int val = x;      // Node value
    public ListNode? next;   // Reference to the next node
}
```
=== "Go"
```go title=""
/* Linked List Node Structure */
type ListNode struct {
Val int // Node value
Next *ListNode // Pointer to the next node
}
// NewListNode Constructor, creates a new linked list node
func NewListNode(val int) *ListNode {
return &ListNode{
Val: val,
Next: nil,
}
}
```
=== "Swift"
```swift title=""
/* Linked List Node Class */
class ListNode {
var val: Int // Node value
var next: ListNode? // Reference to the next node
init(x: Int) { // Constructor
val = x
}
}
```
=== "JS"
```javascript title=""
/* Linked List Node Class */
class ListNode {
constructor(val, next) {
this.val = (val === undefined ? 0 : val); // Node value
this.next = (next === undefined ? null : next); // Reference to the next node
}
}
```
=== "TS"
```typescript title=""
/* Linked List Node Class */
class ListNode {
val: number;
next: ListNode | null;
constructor(val?: number, next?: ListNode | null) {
this.val = val === undefined ? 0 : val; // Node value
this.next = next === undefined ? null : next; // Reference to the next node
}
}
```
=== "Dart"
```dart title=""
/* Linked List Node Class */
class ListNode {
int val; // Node value
ListNode? next; // Reference to the next node
ListNode(this.val, [this.next]); // Constructor
}
```
=== "Rust"
```rust title=""
use std::rc::Rc;
use std::cell::RefCell;
/* Linked List Node Class */
#[derive(Debug)]
struct ListNode {
val: i32, // Node value
next: Option<Rc<RefCell<ListNode>>>, // Pointer to the next node
}
```
=== "C"
```c title=""
/* Linked List Node Structure */
typedef struct ListNode {
int val; // Node value
struct ListNode *next; // Pointer to the next node
} ListNode;
/* Constructor */
ListNode *newListNode(int val) {
ListNode *node;
node = (ListNode *) malloc(sizeof(ListNode));
node->val = val;
node->next = NULL;
return node;
}
```
=== "Zig"
```zig title=""
// Linked List Node Class
pub fn ListNode(comptime T: type) type {
return struct {
const Self = @This();
val: T = 0, // Node value
next: ?*Self = null, // Pointer to the next node
// Constructor
pub fn init(self: *Self, x: T) void {
self.val = x;
self.next = null;
}
};
}
```
## Common Operations on Linked Lists
### Initializing a Linked List
Building a linked list involves two steps: initializing each node object and then establishing the references between nodes. Once initialized, we can access all nodes sequentially from the head node via the `next` reference.
=== "Python"
```python title="linked_list.py"
# Initialize linked list: 1 -> 3 -> 2 -> 5 -> 4
# Initialize each node
n0 = ListNode(1)
n1 = ListNode(3)
n2 = ListNode(2)
n3 = ListNode(5)
n4 = ListNode(4)
# Build references between nodes
n0.next = n1
n1.next = n2
n2.next = n3
n3.next = n4
```
=== "C++"
```cpp title="linked_list.cpp"
/* Initialize linked list: 1 -> 3 -> 2 -> 5 -> 4 */
// Initialize each node
ListNode* n0 = new ListNode(1);
ListNode* n1 = new ListNode(3);
ListNode* n2 = new ListNode(2);
ListNode* n3 = new ListNode(5);
ListNode* n4 = new ListNode(4);
// Build references between nodes
n0->next = n1;
n1->next = n2;
n2->next = n3;
n3->next = n4;
```
=== "Java"
```java title="linked_list.java"
/* Initialize linked list: 1 -> 3 -> 2 -> 5 -> 4 */
// Initialize each node
ListNode n0 = new ListNode(1);
ListNode n1 = new ListNode(3);
ListNode n2 = new ListNode(2);
ListNode n3 = new ListNode(5);
ListNode n4 = new ListNode(4);
// Build references between nodes
n0.next = n1;
n1.next = n2;
n2.next = n3;
n3.next = n4;
```
=== "C#"
```csharp title="linked_list.cs"
/* Initialize linked list: 1 -> 3 -> 2 -> 5 -> 4 */
// Initialize each node
ListNode n0 = new(1);
ListNode n1 = new(3);
ListNode n2 = new(2);
ListNode n3 = new(5);
ListNode n4 = new(4);
// Build references between nodes
n0.next = n1;
n1.next = n2;
n2.next = n3;
n3.next = n4;
```
=== "Go"
```go title="linked_list.go"
/* Initialize linked list: 1 -> 3 -> 2 -> 5 -> 4 */
// Initialize each node
n0 := NewListNode(1)
n1 := NewListNode(3)
n2 := NewListNode(2)
n3 := NewListNode(5)
n4 := NewListNode(4)
// Build references between nodes
n0.Next = n1
n1.Next = n2
n2.Next = n3
n3.Next = n4
```
=== "Swift"
```swift title="linked_list.swift"
/* Initialize linked list: 1 -> 3 -> 2 -> 5 -> 4 */
// Initialize each node
let n0 = ListNode(x: 1)
let n1 = ListNode(x: 3)
let n2 = ListNode(x: 2)
let n3 = ListNode(x: 5)
let n4 = ListNode(x: 4)
// Build references between nodes
n0.next = n1
n1.next = n2
n2.next = n3
n3.next = n4
```
=== "JS"
```javascript title="linked_list.js"
/* Initialize linked list: 1 -> 3 -> 2 -> 5 -> 4 */
// Initialize each node
const n0 = new ListNode(1);
const n1 = new ListNode(3);
const n2 = new ListNode(2);
const n3 = new ListNode(5);
const n4 = new ListNode(4);
// Build references between nodes
n0.next = n1;
n1.next = n2;
n2.next = n3;
n3.next = n4;
```
=== "TS"
```typescript title="linked_list.ts"
/* Initialize linked list: 1 -> 3 -> 2 -> 5 -> 4 */
// Initialize each node
const n0 = new ListNode(1);
const n1 = new ListNode(3);
const n2 = new ListNode(2);
const n3 = new ListNode(5);
const n4 = new ListNode(4);
// Build references between nodes
n0.next = n1;
n1.next = n2;
n2.next = n3;
n3.next = n4;
```
=== "Dart"
```dart title="linked_list.dart"
/* Initialize linked list: 1 -> 3 -> 2 -> 5 -> 4 */
// Initialize each node
ListNode n0 = ListNode(1);
ListNode n1 = ListNode(3);
ListNode n2 = ListNode(2);
ListNode n3 = ListNode(5);
ListNode n4 = ListNode(4);
// Build references between nodes
n0.next = n1;
n1.next = n2;
n2.next = n3;
n3.next = n4;
```
=== "Rust"
```rust title="linked_list.rs"
/* Initialize linked list: 1 -> 3 -> 2 -> 5 -> 4 */
// Initialize each node
let n0 = Rc::new(RefCell::new(ListNode { val: 1, next: None }));
let n1 = Rc::new(RefCell::new(ListNode { val: 3, next: None }));
let n2 = Rc::new(RefCell::new(ListNode { val: 2, next: None }));
let n3 = Rc::new(RefCell::new(ListNode { val: 5, next: None }));
let n4 = Rc::new(RefCell::new(ListNode { val: 4, next: None }));
// Build references between nodes
n0.borrow_mut().next = Some(n1.clone());
n1.borrow_mut().next = Some(n2.clone());
n2.borrow_mut().next = Some(n3.clone());
n3.borrow_mut().next = Some(n4.clone());
```
=== "C"
```c title="linked_list.c"
/* Initialize linked list: 1 -> 3 -> 2 -> 5 -> 4 */
// Initialize each node
ListNode* n0 = newListNode(1);
ListNode* n1 = newListNode(3);
ListNode* n2 = newListNode(2);
ListNode* n3 = newListNode(5);
ListNode* n4 = newListNode(4);
// Build references between nodes
n0->next = n1;
n1->next = n2;
n2->next = n3;
n3->next = n4;
```
=== "Zig"
```zig title="linked_list.zig"
// Initialize linked list: 1 -> 3 -> 2 -> 5 -> 4
// Initialize each node
var n0 = inc.ListNode(i32){.val = 1};
var n1 = inc.ListNode(i32){.val = 3};
var n2 = inc.ListNode(i32){.val = 2};
var n3 = inc.ListNode(i32){.val = 5};
var n4 = inc.ListNode(i32){.val = 4};
// Build references between nodes
n0.next = &n1;
n1.next = &n2;
n2.next = &n3;
n3.next = &n4;
```
An array is a single variable, such as the array `nums` containing elements `nums[0]`, `nums[1]`, etc., while a linked list is composed of multiple independent node objects. **We usually refer to the linked list by its head node**, as in the linked list `n0` in the above code.
### Inserting a Node
Inserting a node in a linked list is very easy. As shown in the image below, suppose we want to insert a new node `P` between two adjacent nodes `n0` and `n1`. **This requires changing only two node references (pointers)**, with a time complexity of $O(1)$.
In contrast, the time complexity of inserting an element in an array is $O(n)$, which is less efficient with large data volumes.
![Linked List Node Insertion Example](linked_list.assets/linkedlist_insert_node.png)
```src
[file]{linked_list}-[class]{}-[func]{insert}
```
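The two reference changes can be sketched in Python as follows. The `ListNode` class repeats the Python definition shown earlier so that the snippet runs on its own, and `values` is a hypothetical helper added only to print the result:

```python
from __future__ import annotations


class ListNode:
    """Singly linked list node (same shape as the Python definition above)."""

    def __init__(self, val: int):
        self.val: int = val
        self.next: ListNode | None = None


def insert(n0: ListNode, p: ListNode) -> None:
    """Insert node `p` right after node `n0` in O(1) time."""
    n1 = n0.next
    p.next = n1  # First reference change: p now points to n1
    n0.next = p  # Second reference change: n0 now points to p


def values(head: ListNode | None) -> list[int]:
    """Hypothetical helper: collect node values from head to tail."""
    out = []
    while head is not None:
        out.append(head.val)
        head = head.next
    return out


n0, n1 = ListNode(1), ListNode(3)
n0.next = n1
insert(n0, ListNode(0))
print(values(n0))  # [1, 0, 3]
```

Setting `p.next` before `n0.next` matters: if `n0.next` were overwritten first without saving `n1`, the rest of the list would become unreachable.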
### Deleting a Node
As shown below, deleting a node in a linked list is also very convenient, **requiring only the change of one node's reference (pointer)**.
Note that although node `P` still points to `n1` after the deletion operation is completed, it is no longer accessible when traversing the list, meaning `P` is no longer part of the list.
![Linked List Node Deletion](linked_list.assets/linkedlist_remove_node.png)
```src
[file]{linked_list}-[class]{}-[func]{remove}
```
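A possible Python sketch of this deletion is shown below. It removes the node directly after `n0`; the `ListNode` class is repeated and the `values` helper is hypothetical, so that the snippet is self-contained:

```python
from __future__ import annotations


class ListNode:
    """Singly linked list node (same shape as the Python definition above)."""

    def __init__(self, val: int):
        self.val: int = val
        self.next: ListNode | None = None


def remove(n0: ListNode) -> None:
    """Remove the first node after `n0` in O(1) time."""
    if n0.next is None:
        return  # Nothing to remove
    p = n0.next
    n1 = p.next
    n0.next = n1  # The only reference change: n0 now skips p


def values(head: ListNode | None) -> list[int]:
    """Hypothetical helper: collect node values from head to tail."""
    out = []
    while head is not None:
        out.append(head.val)
        head = head.next
    return out


n0, p, n1 = ListNode(1), ListNode(0), ListNode(3)
n0.next = p
p.next = n1
remove(n0)
print(values(n0))    # [1, 3]
print(p.next is n1)  # True: p still points to n1 but is unreachable from n0
```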
### Accessing Nodes
**Accessing nodes in a linked list is less efficient**. As mentioned earlier, any element in an array can be accessed in $O(1)$ time. However, in a linked list, the program needs to start from the head node and traverse each node sequentially until it finds the target node. That is, accessing the $i$-th node of a linked list requires $i - 1$ iterations, with a time complexity of $O(n)$.
```src
[file]{linked_list}-[class]{}-[func]{access}
```
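The sequential walk can be sketched in Python like this. The snippet uses a 0-based `index` and returns `None` when the index is out of range, both of which are assumptions of this example. `build` is a hypothetical helper for constructing a test list:

```python
from __future__ import annotations


class ListNode:
    """Singly linked list node (same shape as the Python definition above)."""

    def __init__(self, val: int):
        self.val: int = val
        self.next: ListNode | None = None


def access(head: ListNode | None, index: int) -> ListNode | None:
    """Return the node at 0-based `index`, or None if the list is too short."""
    for _ in range(index):
        if head is None:
            return None
        head = head.next  # One step forward per iteration
    return head


def build(vals: list[int]) -> ListNode | None:
    """Hypothetical helper: build a linked list from a Python list."""
    head = None
    for v in reversed(vals):
        node = ListNode(v)
        node.next = head
        head = node
    return head


head = build([1, 3, 2, 5, 4])
print(access(head, 3).val)  # 5
print(access(head, 10))     # None
```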
### Finding Nodes
Traverse the linked list to find a node with a value equal to `target`, and output the index of that node in the linked list. This process also falls under linear search. The code is as follows:
```src
[file]{linked_list}-[class]{}-[func]{find}
```
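A minimal Python sketch of this search follows. Returning `-1` when no node matches is an assumed convention, and `build` is a hypothetical helper used only to set up the example:

```python
from __future__ import annotations


class ListNode:
    """Singly linked list node (same shape as the Python definition above)."""

    def __init__(self, val: int):
        self.val: int = val
        self.next: ListNode | None = None


def find(head: ListNode | None, target: int) -> int:
    """Return the index of the first node whose value equals `target`, or -1."""
    index = 0
    while head is not None:
        if head.val == target:
            return index
        head = head.next
        index += 1
    return -1


def build(vals: list[int]) -> ListNode | None:
    """Hypothetical helper: build a linked list from a Python list."""
    head = None
    for v in reversed(vals):
        node = ListNode(v)
        node.next = head
        head = node
    return head


head = build([1, 3, 2, 5, 4])
print(find(head, 2))  # 2
print(find(head, 7))  # -1
```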
## Arrays vs. Linked Lists
The following table summarizes the characteristics of arrays and linked lists and compares their operational efficiencies. Since they employ two opposite storage strategies, their properties and operational efficiencies also show contrasting features.
<p align="center"> Table <id> &nbsp; Efficiency Comparison of Arrays and Linked Lists </p>
| | Arrays | Linked Lists |
| ------------------ | ------------------------------------------------ | ----------------------- |
| Storage | Contiguous Memory Space | Dispersed Memory Space |
| Capacity Expansion | Fixed Length | Flexible Expansion |
| Memory Efficiency | Less Memory per Element, Potential Space Wastage | More Memory per Element |
| Accessing Elements | $O(1)$ | $O(n)$ |
| Adding Elements | $O(n)$ | $O(1)$ |
| Deleting Elements | $O(n)$ | $O(1)$ |
## Common Types of Linked Lists
As shown in the following image, there are three common types of linked lists.
- **Singly Linked List**: This is the regular linked list introduced earlier. The nodes of a singly linked list contain the value and a reference to the next node. The first node is called the head node, and the last node, pointing to null (`None`), is the tail node.
- **Circular Linked List**: If the tail node of a singly linked list points back to the head node (forming a loop), it becomes a circular linked list. In a circular linked list, any node can be considered the head node.
- **Doubly Linked List**: Compared to a singly linked list, a doubly linked list stores references in two directions. Its nodes contain references to both the next (successor) and the previous (predecessor) nodes. Doubly linked lists are more flexible as they allow traversal in both directions but require more memory space.
=== "Python"
```python title=""
class ListNode:
    """Doubly linked list node class"""
    def __init__(self, val: int):
        self.val: int = val                # Node value
        self.next: ListNode | None = None  # Reference to the successor node
        self.prev: ListNode | None = None  # Reference to the predecessor node
```
=== "C++"
```cpp title=""
/* Bidirectional linked list node structure */
struct ListNode {
int val; // Node value
ListNode *next; // Pointer to the successor node
ListNode *prev; // Pointer to the predecessor node
ListNode(int x) : val(x), next(nullptr), prev(nullptr) {} // Constructor
};
```
=== "Java"
```java title=""
/* Bidirectional linked list node class */
class ListNode {
int val; // Node value
ListNode next; // Reference to the next node
ListNode prev; // Reference to the predecessor node
ListNode(int x) { val = x; } // Constructor
}
```
=== "C#"
```csharp title=""
/* Bidirectional linked list node class */
class ListNode(int x) { // Constructor
    public int val = x;      // Node value
    public ListNode? next;   // Reference to the successor node
    public ListNode? prev;   // Reference to the predecessor node
}
```
=== "Go"
```go title=""
/* Bidirectional linked list node structure */
type DoublyListNode struct {
Val int // Node value
Next *DoublyListNode // Pointer to the successor node
Prev *DoublyListNode // Pointer to the predecessor node
}
// NewDoublyListNode initialization
func NewDoublyListNode(val int) *DoublyListNode {
return &DoublyListNode{
Val: val,
Next: nil,
Prev: nil,
}
}
```
=== "Swift"
```swift title=""
/* Bidirectional linked list node class */
class ListNode {
var val: Int // Node value
var next: ListNode? // Reference to the next node
var prev: ListNode? // Reference to the predecessor node
init(x: Int) { // Constructor
val = x
}
}
```
=== "JS"
```javascript title=""
/* Bidirectional linked list node class */
class ListNode {
constructor(val, next, prev) {
this.val = val === undefined ? 0 : val; // Node value
this.next = next === undefined ? null : next; // Reference to the successor node
this.prev = prev === undefined ? null : prev; // Reference to the predecessor node
}
}
```
=== "TS"
```typescript title=""
/* Bidirectional linked list node class */
class ListNode {
val: number;
next: ListNode | null;
prev: ListNode | null;
constructor(val?: number, next?: ListNode | null, prev?: ListNode | null) {
this.val = val === undefined ? 0 : val; // Node value
this.next = next === undefined ? null : next; // Reference to the successor node
this.prev = prev === undefined ? null : prev; // Reference to the predecessor node
}
}
```
=== "Dart"
```dart title=""
/* Bidirectional linked list node class */
class ListNode {
int val; // Node value
ListNode? next; // Reference to the successor node
ListNode? prev; // Reference to the predecessor node
ListNode(this.val, [this.next, this.prev]); // Constructor
}
```
=== "Rust"
```rust title=""
use std::rc::Rc;
use std::cell::RefCell;
/* Bidirectional linked list node type */
#[derive(Debug)]
struct ListNode {
val: i32, // Node value
next: Option<Rc<RefCell<ListNode>>>, // Pointer to successor node
prev: Option<Rc<RefCell<ListNode>>>, // Pointer to predecessor node
}
/* Constructors */
impl ListNode {
fn new(val: i32) -> Self {
ListNode {
val,
next: None,
prev: None,
}
}
}
```
=== "C"
```c title=""
/* Bidirectional linked list node structure */
typedef struct ListNode {
int val; // Node value
struct ListNode *next; // Pointer to the successor node
struct ListNode *prev; // Pointer to the predecessor node
} ListNode;
/* Constructors */
ListNode *newListNode(int val) {
ListNode *node;
node = (ListNode *) malloc(sizeof(ListNode));
node->val = val;
node->next = NULL;
node->prev = NULL;
return node;
}
```
=== "Zig"
```zig title=""
// Bidirectional linked list node class
pub fn ListNode(comptime T: type) type {
return struct {
const Self = @This();
val: T = 0, // Node value
next: ?*Self = null, // Pointer to the successor node
prev: ?*Self = null, // Pointer to the predecessor node
// Constructor
pub fn init(self: *Self, x: T) void {
self.val = x;
self.next = null;
self.prev = null;
}
};
}
```
![Common Types of Linked Lists](linked_list.assets/linkedlist_common_types.png)
## Typical Applications of Linked Lists
Singly linked lists are commonly used to implement stacks, queues, hash tables, and graphs.
- **Stacks and Queues**: When insertion and deletion operations are performed at one end of the linked list, it exhibits last-in-first-out characteristics, corresponding to a stack. When insertion is at one end and deletion is at the other, it shows first-in-first-out characteristics, corresponding to a queue.
- **Hash Tables**: Chaining is one of the mainstream solutions to hash collisions, where all colliding elements are placed in a linked list.
- **Graphs**: Adjacency lists are a common way to represent graphs, where each vertex is associated with a linked list. Each element in that list represents another vertex connected to that vertex.
Doubly linked lists are commonly used in scenarios that require quick access to the previous and next elements.
- **Advanced Data Structures**: For example, in red-black trees and B-trees, we need to access a node's parent, which can be achieved by storing a reference to the parent node in each node, similar to a doubly linked list.
- **Browser History**: In web browsers, when a user clicks the forward or backward button, the browser needs to know the previously and next visited web pages. The properties of a doubly linked list make this operation simple.
- **LRU Algorithm**: In Least Recently Used (LRU) cache eviction algorithms, we need to quickly find the least recently used data and support rapid addition and deletion of nodes. Here, using a doubly linked list is very appropriate.
Circular linked lists are commonly used in scenarios requiring periodic operations, such as resource scheduling in operating systems.
- **Round-Robin Scheduling Algorithm**: In operating systems, the round-robin scheduling algorithm is a common CPU scheduling algorithm that cycles through a group of processes. Each process is assigned a time slice, and when it expires, the CPU switches to the next process. This circular operation can be implemented using a circular linked list.
- **Data Buffers**: Circular linked lists may also be used in some data buffer implementations. For instance, in audio and video players, the data stream might be divided into multiple buffer blocks placed in a circular linked list to achieve seamless playback.


@ -0,0 +1,870 @@
# List
A "list" is an abstract data structure concept, representing an ordered collection of elements. It supports operations like element access, modification, addition, deletion, and traversal, without requiring users to consider capacity limitations. Lists can be implemented based on linked lists or arrays.
- A linked list naturally functions as a list, supporting operations for adding, deleting, searching, and modifying elements, and can dynamically adjust its size.
- Arrays also support these operations, but due to their fixed length, they can be considered as a list with a length limit.
When using arrays to implement lists, **the fixed length property reduces the practicality of the list**. This is because we often cannot determine in advance how much data needs to be stored, making it difficult to choose an appropriate list length. If the length is too small, it may not meet the requirements; if too large, it may waste memory space.
To solve this problem, we can use a "dynamic array" to implement lists. It inherits the advantages of arrays and can dynamically expand during program execution.
In fact, **many programming languages' standard libraries implement lists using dynamic arrays**, such as Python's `list`, Java's `ArrayList`, C++'s `vector`, and C#'s `List`. In the following discussion, we will consider "list" and "dynamic array" as synonymous concepts.
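To make the idea of a dynamic array concrete, here is a simplified Python sketch. The class name `DynamicArray`, the initial capacity of 2, and the doubling growth factor are all assumptions chosen for illustration. Real standard-library implementations differ in their details.

```python
class DynamicArray:
    """A simplified dynamic array backed by a fixed-length array (illustrative only)."""

    def __init__(self, capacity: int = 2):
        self._capacity = capacity     # Length of the underlying array
        self._arr = [0] * capacity    # Underlying fixed-length storage
        self._size = 0                # Number of elements actually stored

    def add(self, num: int) -> None:
        """Append `num`, growing the underlying array when it is full."""
        if self._size == self._capacity:
            self._grow()
        self._arr[self._size] = num
        self._size += 1

    def _grow(self) -> None:
        """Allocate an array twice as long and copy the elements over: O(n)."""
        self._capacity *= 2
        new_arr = [0] * self._capacity
        for i in range(self._size):
            new_arr[i] = self._arr[i]
        self._arr = new_arr

    def to_list(self) -> list[int]:
        """Return only the elements in use."""
        return self._arr[: self._size]


nums = DynamicArray()
for x in [1, 3, 2, 5, 4]:
    nums.add(x)
print(nums.to_list())  # [1, 3, 2, 5, 4]
```

The caller never handles capacity directly; the expansion cost is paid only on the occasional `add` that finds the underlying array full.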
## Common List Operations
### Initializing a List
We typically use two methods of initialization: "without initial values" and "with initial values".
=== "Python"
```python title="list.py"
# Initialize list
# Without initial values
nums1: list[int] = []
# With initial values
nums: list[int] = [1, 3, 2, 5, 4]
```
=== "C++"
```cpp title="list.cpp"
/* Initialize list */
// Note: in C++, vector is the equivalent of the list described here
// Without initial values
vector<int> nums1;
// With initial values
vector<int> nums = { 1, 3, 2, 5, 4 };
```
=== "Java"
```java title="list.java"
/* Initialize list */
// Without initial values
List<Integer> nums1 = new ArrayList<>();
// With initial values (note that the array must use the wrapper class Integer rather than the primitive int)
Integer[] numbers = new Integer[] { 1, 3, 2, 5, 4 };
List<Integer> nums = new ArrayList<>(Arrays.asList(numbers));
```
=== "C#"
```csharp title="list.cs"
/* Initialize list */
// Without initial values
List<int> nums1 = [];
// With initial values
int[] numbers = [1, 3, 2, 5, 4];
List<int> nums = [.. numbers];
```
=== "Go"
```go title="list_test.go"
/* Initialize list */
// Without initial values
nums1 := []int{}
// With initial values
nums := []int{1, 3, 2, 5, 4}
```
=== "Swift"
```swift title="list.swift"
/* Initialize list */
// Without initial values
let nums1: [Int] = []
// With initial values
var nums = [1, 3, 2, 5, 4]
```
=== "JS"
```javascript title="list.js"
/* Initialize list */
// Without initial values
const nums1 = [];
// With initial values
const nums = [1, 3, 2, 5, 4];
```
=== "TS"
```typescript title="list.ts"
/* Initialize list */
// Without initial values
const nums1: number[] = [];
// With initial values
const nums: number[] = [1, 3, 2, 5, 4];
```
=== "Dart"
```dart title="list.dart"
/* Initialize list */
// Without initial values
List<int> nums1 = [];
// With initial values
List<int> nums = [1, 3, 2, 5, 4];
```
=== "Rust"
```rust title="list.rs"
/* Initialize list */
// Without initial values
let nums1: Vec<i32> = Vec::new();
// With initial values
let mut nums: Vec<i32> = vec![1, 3, 2, 5, 4]; // mut, because later examples modify nums
```
=== "C"
```c title="list.c"
// C does not provide built-in dynamic arrays
```
=== "Zig"
```zig title="list.zig"
// Initialize list
var nums = std.ArrayList(i32).init(std.heap.page_allocator);
defer nums.deinit();
try nums.appendSlice(&[_]i32{ 1, 3, 2, 5, 4 });
```
### Accessing Elements
Lists are essentially arrays, so accessing and updating elements can be done in $O(1)$ time, which is very efficient.
=== "Python"
```python title="list.py"
# Access elements
num: int = nums[1] # Access the element at index 1
# Update elements
nums[1] = 0 # Update the element at index 1 to 0
```
=== "C++"
```cpp title="list.cpp"
/* Access elements */
int num = nums[1]; // Access the element at index 1
/* Update elements */
nums[1] = 0; // Update the element at index 1 to 0
```
=== "Java"
```java title="list.java"
/* Access elements */
int num = nums.get(1); // Access the element at index 1
/* Update elements */
nums.set(1, 0); // Update the element at index 1 to 0
```
=== "C#"
```csharp title="list.cs"
/* Access elements */
int num = nums[1]; // Access the element at index 1
/* Update elements */
nums[1] = 0; // Update the element at index 1 to 0
```
=== "Go"
```go title="list_test.go"
/* Access elements */
num := nums[1] // Access the element at index 1
/* Update elements */
nums[1] = 0 // Update the element at index 1 to 0
```
=== "Swift"
```swift title="list.swift"
/* Access elements */
let num = nums[1] // Access the element at index 1
/* Update elements */
nums[1] = 0 // Update the element at index 1 to 0
```
=== "JS"
```javascript title="list.js"
/* Access elements */
const num = nums[1]; // Access the element at index 1
/* Update elements */
nums[1] = 0; // Update the element at index 1 to 0
```
=== "TS"
```typescript title="list.ts"
/* Access elements */
const num: number = nums[1]; // Access the element at index 1
/* Update elements */
nums[1] = 0; // Update the element at index 1 to 0
```
=== "Dart"
```dart title="list.dart"
/* Access elements */
int num = nums[1]; // Access the element at index 1
/* Update elements */
nums[1] = 0; // Update the element at index 1 to 0
```
=== "Rust"
```rust title="list.rs"
/* Access elements */
let num: i32 = nums[1]; // Access the element at index 1
/* Update elements */
nums[1] = 0; // Update the element at index 1 to 0
```
=== "C"
```c title="list.c"
// C does not provide built-in dynamic arrays
```
=== "Zig"
```zig title="list.zig"
// Access elements
var num = nums.items[1]; // Access the element at index 1
// Update elements
nums.items[1] = 0; // Update the element at index 1 to 0
```
### Inserting and Deleting Elements
Compared to arrays, lists can freely add and remove elements. Appending an element to the end of the list takes $O(1)$ time (amortized, since an occasional expansion copies every element), but inserting or deleting elements elsewhere is as inefficient as in arrays, with a time complexity of $O(n)$.
=== "Python"
```python title="list.py"
# Clear list
nums.clear()
# Append elements at the end
nums.append(1)
nums.append(3)
nums.append(2)
nums.append(5)
nums.append(4)
# Insert element in the middle
nums.insert(3, 6) # Insert number 6 at index 3
# Remove elements
nums.pop(3) # Remove the element at index 3
```
=== "C++"
```cpp title="list.cpp"
/* Clear list */
nums.clear();
/* Append elements at the end */
nums.push_back(1);
nums.push_back(3);
nums.push_back(2);
nums.push_back(5);
nums.push_back(4);
/* Insert element in the middle */
nums.insert(nums.begin() + 3, 6); // Insert number 6 at index 3
/* Remove elements */
nums.erase(nums.begin() + 3); // Remove the element at index 3
```
=== "Java"
```java title="list.java"
/* Clear list */
nums.clear();
/* Append elements at the end */
nums.add(1);
nums.add(3);
nums.add(2);
nums.add(5);
nums.add(4);
/* Insert element in the middle */
nums.add(3, 6); // Insert number 6 at index 3
/* Remove elements */
nums.remove(3); // Remove the element at index 3
```
=== "C#"
```csharp title="list.cs"
/* Clear list */
nums.Clear();
/* Append elements at the end */
nums.Add(1);
nums.Add(3);
nums.Add(2);
nums.Add(5);
nums.Add(4);
/* Insert element in the middle */
nums.Insert(3, 6); // Insert number 6 at index 3
/* Remove elements */
nums.RemoveAt(3); // Remove the element at index 3
```
=== "Go"
```go title="list_test.go"
/* Clear list */
nums = nil
/* Append elements at the end */
nums = append(nums, 1)
nums = append(nums, 3)
nums = append(nums, 2)
nums = append(nums, 5)
nums = append(nums, 4)
/* Insert element in the middle */
nums = append(nums[:3], append([]int{6}, nums[3:]...)...) // Insert number 6 at index 3
/* Remove elements */
nums = append(nums[:3], nums[4:]...) // Remove the element at index 3
```
=== "Swift"
```swift title="list.swift"
/* Clear list */
nums.removeAll()
/* Append elements at the end */
nums.append(1)
nums.append(3)
nums.append(2)
nums.append(5)
nums.append(4)
/* Insert element in the middle */
nums.insert(6, at: 3) // Insert number 6 at index 3
/* Remove elements */
nums.remove(at: 3) // Remove the element at index 3
```
=== "JS"
```javascript title="list.js"
/* Clear list */
nums.length = 0;
/* Append elements at the end */
nums.push(1);
nums.push(3);
nums.push(2);
nums.push(5);
nums.push(4);
/* Insert element in the middle */
nums.splice(3, 0, 6); // Insert number 6 at index 3
/* Remove elements */
nums.splice(3, 1); // Remove the element at index 3
```
=== "TS"
```typescript title="list.ts"
/* Clear list */
nums.length = 0;
/* Append elements at the end */
nums.push(1);
nums.push(3);
nums.push(2);
nums.push(5);
nums.push(4);
/* Insert element in the middle */
nums.splice(3, 0, 6); // Insert number 6 at index 3
/* Remove elements */
nums.splice(3, 1); // Remove the element at index 3
```
=== "Dart"
```dart title="list.dart"
/* Clear list */
nums.clear();
/* Append elements at the end */
nums.add(1);
nums.add(3);
nums.add(2);
nums.add(5);
nums.add(4);
/* Insert element in the middle */
nums.insert(3, 6); // Insert number 6 at index 3
/* Remove elements */
nums.removeAt(3); // Remove the element at index 3
```
=== "Rust"
```rust title="list.rs"
/* Clear list */
nums.clear();
/* Append elements at the end */
nums.push(1);
nums.push(3);
nums.push(2);
nums.push(5);
nums.push(4);
/* Insert element in the middle */
nums.insert(3, 6); // Insert number 6 at index 3
/* Remove elements */
nums.remove(3); // Remove the element at index 3
```
=== "C"
```c title="list.c"
// C does not provide built-in dynamic arrays
```
=== "Zig"
```zig title="list.zig"
// Clear list
nums.clearRetainingCapacity();
// Append elements at the end
try nums.append(1);
try nums.append(3);
try nums.append(2);
try nums.append(5);
try nums.append(4);
// Insert element in the middle
try nums.insert(3, 6); // Insert number 6 at index 3
// Remove elements
_ = nums.orderedRemove(3); // Remove the element at index 3
```
### Traversing the List
Like arrays, lists can be traversed based on index, or by directly iterating over each element.
=== "Python"
```python title="list.py"
# Iterate through the list by index
count = 0
for i in range(len(nums)):
    count += nums[i]
# Iterate directly through list elements
count = 0
for num in nums:
    count += num
```
=== "C++"
```cpp title="list.cpp"
/* Iterate through the list by index */
int count = 0;
for (int i = 0; i < nums.size(); i++) {
count += nums[i];
}
/* Iterate directly through list elements */
count = 0;
for (int num : nums) {
count += num;
}
```
=== "Java"
```java title="list.java"
/* Iterate through the list by index */
int count = 0;
for (int i = 0; i < nums.size(); i++) {
    count += nums.get(i);
}
/* Iterate directly through list elements */
count = 0;
for (int num : nums) {
    count += num;
}
```
=== "C#"
```csharp title="list.cs"
/* Iterate through the list by index */
int count = 0;
for (int i = 0; i < nums.Count; i++) {
count += nums[i];
}
/* Iterate directly through list elements */
count = 0;
foreach (int num in nums) {
count += num;
}
```
=== "Go"
```go title="list_test.go"
/* Iterate through the list by index */
count := 0
for i := 0; i < len(nums); i++ {
count += nums[i]
}
/* Iterate directly through list elements */
count = 0
for _, num := range nums {
count += num
}
```
=== "Swift"
```swift title="list.swift"
/* Iterate through the list by index */
var count = 0
for i in nums.indices {
count += nums[i]
}
/* Iterate directly through list elements */
count = 0
for num in nums {
count += num
}
```
=== "JS"
```javascript title="list.js"
/* Iterate through the list by index */
let count = 0;
for (let i = 0; i < nums.length; i++) {
count += nums[i];
}
/* Iterate directly through list elements */
count = 0;
for (const num of nums) {
count += num;
}
```
=== "TS"
```typescript title="list.ts"
/* Iterate through the list by index */
let count = 0;
for (let i = 0; i < nums.length; i++) {
count += nums[i];
}
/* Iterate directly through list elements */
count = 0;
for (const num of nums) {
count += num;
}
```
=== "Dart"
```dart title="list.dart"
/* Iterate through the list by index */
int count = 0;
for (var i = 0; i < nums.length; i++) {
count += nums[i];
}
/* Iterate directly through list elements */
count = 0;
for (var num in nums) {
count += num;
}
```
=== "Rust"
```rust title="list.rs"
// Iterate through the list by index
let mut _count = 0;
for i in 0..nums.len() {
_count += nums[i];
}
// Iterate directly through list elements
_count = 0;
for num in &nums {
_count += num;
}
```
=== "C"
```c title="list.c"
// C does not provide built-in dynamic arrays
```
=== "Zig"
```zig title="list.zig"
// Iterate through the list by index
var count: i32 = 0;
var i: usize = 0; // Slice indices must be usize
while (i < nums.items.len) : (i += 1) {
    count += nums.items[i]; // Index the underlying slice, not the ArrayList itself
}
// Iterate directly through list elements
count = 0;
for (nums.items) |num| {
    count += num;
}
```
### Concatenating Lists
Given a new list `nums1`, we can append it to the end of the original list.
=== "Python"
```python title="list.py"
# Concatenate two lists
nums1: list[int] = [6, 8, 7, 10, 9]
nums += nums1 # Concatenate nums1 to the end of nums
```
=== "C++"
```cpp title="list.cpp"
/* Concatenate two lists */
vector<int> nums1 = { 6, 8, 7, 10, 9 };
// Concatenate nums1 to the end of nums
nums.insert(nums.end(), nums1.begin(), nums1.end());
```
=== "Java"
```java title="list.java"
/* Concatenate two lists */
List<Integer> nums1 = new ArrayList<>(Arrays.asList(new Integer[] { 6, 8, 7, 10, 9 }));
nums.addAll(nums1); // Concatenate nums1 to the end of nums
```
=== "C#"
```csharp title="list.cs"
/* Concatenate two lists */
List<int> nums1 = [6, 8, 7, 10, 9];
nums.AddRange(nums1); // Concatenate nums1 to the end of nums
```
=== "Go"
```go title="list_test.go"
/* Concatenate two lists */
nums1 := []int{6, 8, 7, 10, 9}
nums = append(nums, nums1...) // Concatenate nums1 to the end of nums
```
=== "Swift"
```swift title="list.swift"
/* Concatenate two lists */
let nums1 = [6, 8, 7, 10, 9]
nums.append(contentsOf: nums1) // Concatenate nums1 to the end of nums
```
=== "JS"
```javascript title="list.js"
/* Concatenate two lists */
const nums1 = [6, 8, 7, 10, 9];
nums.push(...nums1); // Concatenate nums1 to the end of nums
```
=== "TS"
```typescript title="list.ts"
/* Concatenate two lists */
const nums1: number[] = [6, 8, 7, 10, 9];
nums.push(...nums1); // Concatenate nums1 to the end of nums
```
=== "Dart"
```dart title="list.dart"
/* Concatenate two lists */
List<int> nums1 = [6, 8, 7, 10, 9];
nums.addAll(nums1); // Concatenate nums1 to the end of nums
```
=== "Rust"
```rust title="list.rs"
/* Concatenate two lists */
let nums1: Vec<i32> = vec![6, 8, 7, 10, 9];
nums.extend(nums1); // Concatenate nums1 to the end of nums
```
=== "C"
```c title="list.c"
// C does not provide built-in dynamic arrays
```
=== "Zig"
```zig title="list.zig"
// Concatenate two lists
var nums1 = std.ArrayList(i32).init(std.heap.page_allocator);
defer nums1.deinit();
try nums1.appendSlice(&[_]i32{ 6, 8, 7, 10, 9 });
try nums.insertSlice(nums.items.len, nums1.items); // Concatenate nums1 to the end of nums
```
### Sorting the List
Once the list is sorted, we can apply techniques that frequently appear in array-related algorithm problems, such as "binary search" and "two-pointer" algorithms.
=== "Python"
```python title="list.py"
# Sort the list
nums.sort() # After sorting, the list elements are in ascending order
```
=== "C++"
```cpp title="list.cpp"
/* Sort the list */
sort(nums.begin(), nums.end()); // After sorting, the list elements are in ascending order
```
=== "Java"
```java title="list.java"
/* Sort the list */
Collections.sort(nums); // After sorting, the list elements are in ascending order
```
=== "C#"
```csharp title="list.cs"
/* Sort the list */
nums.Sort(); // After sorting, the list elements are in ascending order
```
=== "Go"
```go title="list_test.go"
/* Sort the list */
sort.Ints(nums) // After sorting, the list elements are in ascending order
```
=== "Swift"
```swift title="list.swift"
/* Sort the list */
nums.sort() // After sorting, the list elements are in ascending order
```
=== "JS"
```javascript title="list.js"
/* Sort the list */
nums.sort((a, b) => a - b); // After sorting, the list elements are in ascending order
```
=== "TS"
```typescript title="list.ts"
/* Sort the list */
nums.sort((a, b) => a - b); // After sorting, the list elements are in ascending order
```
=== "Dart"
```dart title="list.dart"
/* Sort the list */
nums.sort(); // After sorting, the list elements are in ascending order
```
=== "Rust"
```rust title="list.rs"
/* Sort the list */
nums.sort(); // After sorting, the list elements are in ascending order
```
=== "C"
```c title="list.c"
// C does not provide built-in dynamic arrays
```
=== "Zig"
```zig title="list.zig"
// Sort the list
std.sort.sort(i32, nums.items, {}, comptime std.sort.asc(i32));
```
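As a rough illustration of why sorting matters, the sketch below runs a binary search and a two-pointer scan on a sorted list. The helper names `binary_search` and `has_pair_with_sum` are made up for this example; `bisect_left` comes from Python's standard `bisect` module.

```python
from bisect import bisect_left


def binary_search(sorted_nums: list[int], target: int) -> int:
    """Return the index of target in sorted_nums, or -1 if it is absent."""
    i = bisect_left(sorted_nums, target)  # leftmost position where target could be inserted
    if i < len(sorted_nums) and sorted_nums[i] == target:
        return i
    return -1


def has_pair_with_sum(sorted_nums: list[int], target: int) -> bool:
    """Two-pointer scan: do two elements at different indices add up to target?"""
    left, right = 0, len(sorted_nums) - 1
    while left < right:
        total = sorted_nums[left] + sorted_nums[right]
        if total == target:
            return True
        if total < target:
            left += 1   # sum too small: move the left pointer to a larger value
        else:
            right -= 1  # sum too large: move the right pointer to a smaller value
    return False


nums = [1, 3, 2, 5, 4]
nums.sort()                             # [1, 2, 3, 4, 5]
index_of_4 = binary_search(nums, 4)     # 3
index_of_6 = binary_search(nums, 6)     # -1, not present
found_7 = has_pair_with_sum(nums, 7)    # True (2 + 5)
found_10 = has_pair_with_sum(nums, 10)  # False (the largest possible sum is 9)
```

Both helpers rely on the ascending order: binary search halves the candidate range at each step, and the two-pointer scan knows which pointer to move only because the values are ordered.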
## List Implementation
Many programming languages have built-in lists, such as Java, C++, Python, etc. Their implementations are quite complex, with carefully tuned parameters such as initial capacity and expansion multiplier. Interested readers can consult the source code to learn more.
To deepen the understanding of how lists work, let's try implementing a simple version of a list, focusing on three key designs.
- **Initial Capacity**: Choose a reasonable initial capacity for the array. In this example, we choose 10 as the initial capacity.
- **Size Recording**: Declare a variable `size` to record the current number of elements in the list, updating in real-time with element insertion and deletion. With this variable, we can locate the end of the list and determine whether expansion is needed.
- **Expansion Mechanism**: If the list's capacity is full when inserting an element, expansion is necessary. First, create a larger array based on the expansion multiplier, then move all elements of the current array to the new array. In this example, we define that each time the array will expand to twice its previous size.
```src
[file]{my_list}-[class]{my_list}-[func]{}
```
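The three design points above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the book's `my_list` source: the class name `MyList`, its method names, and the `to_list` helper are chosen for this sketch only.

```python
class MyList:
    """Simplified list backed by a fixed-length array (illustrative sketch)."""

    def __init__(self):
        self._capacity = 10                # initial capacity
        self._arr = [0] * self._capacity   # backing array of fixed length
        self._size = 0                     # current number of elements
        self._extend_ratio = 2             # expansion multiplier

    def size(self) -> int:
        return self._size

    def capacity(self) -> int:
        return self._capacity

    def get(self, index: int) -> int:
        if index < 0 or index >= self._size:
            raise IndexError("index out of bounds")
        return self._arr[index]

    def add(self, num: int) -> None:
        """Append num at the end, expanding first if the array is full."""
        if self._size == self._capacity:
            self._extend_capacity()
        self._arr[self._size] = num
        self._size += 1

    def insert(self, num: int, index: int) -> None:
        """Insert num at index, shifting later elements one slot to the right."""
        if index < 0 or index > self._size:
            raise IndexError("index out of bounds")
        if self._size == self._capacity:
            self._extend_capacity()
        for j in range(self._size - 1, index - 1, -1):
            self._arr[j + 1] = self._arr[j]
        self._arr[index] = num
        self._size += 1

    def remove(self, index: int) -> int:
        """Remove and return the element at index, shifting later elements left."""
        if index < 0 or index >= self._size:
            raise IndexError("index out of bounds")
        num = self._arr[index]
        for j in range(index, self._size - 1):
            self._arr[j] = self._arr[j + 1]
        self._size -= 1
        return num

    def _extend_capacity(self) -> None:
        """Create an array twice as large and move every element into it."""
        new_capacity = self._capacity * self._extend_ratio
        new_arr = [0] * new_capacity
        for i in range(self._size):
            new_arr[i] = self._arr[i]
        self._arr = new_arr
        self._capacity = new_capacity

    def to_list(self) -> list[int]:
        """Return only the valid elements (for printing or testing)."""
        return self._arr[: self._size]


demo = MyList()
for value in range(11):    # the 11th append triggers one expansion (10 -> 20)
    demo.add(value)
demo.insert(99, 3)         # [0, 1, 2, 99, 3, ..., 10]
removed = demo.remove(0)   # removes the leading 0
```

Keeping `size` separate from `capacity` is what lets the list tell the valid elements apart from the unused slots at the end of the backing array.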

@ -0,0 +1,71 @@
# Memory and Cache *
In the first two sections of this chapter, we explored arrays and linked lists, two fundamental and important data structures, representing "continuous storage" and "dispersed storage" respectively.
In fact, **the physical structure largely determines the efficiency of a program's use of memory and cache**, which in turn affects the overall performance of the algorithm.
## Computer Storage Devices
There are three types of storage devices in computers: "hard disk," "random-access memory (RAM)," and "cache memory." The following table shows their different roles and performance characteristics in computer systems.
<p align="center"> Table <id> &nbsp; Computer Storage Devices </p>
| | Hard Disk | Memory | Cache |
| ---------- | -------------------------------------------------------------- | ------------------------------------------------------------------------ | ----------------------------------------------------------------------------------------------- |
| Usage | Long-term storage of data, including OS, programs, files, etc. | Temporary storage of currently running programs and data being processed | Stores frequently accessed data and instructions, reducing the number of CPU accesses to memory |
| Volatility | Data is not lost after power off | Data is lost after power off | Data is lost after power off |
| Capacity | Larger, TB level | Smaller, GB level | Very small, MB level |
| Speed | Slower, several hundred to several thousand MB/s | Faster, several tens of GB/s | Very fast, several tens to hundreds of GB/s |
| Price | Cheaper, several cents to several yuan / GB | More expensive, tens to hundreds of yuan / GB | Very expensive, priced with CPU |
We can imagine the computer storage system as a pyramid structure shown in the figure below. The storage devices closer to the top of the pyramid are faster, have smaller capacity, and are more costly. This multi-level design is not accidental, but the result of careful consideration by computer scientists and engineers.
- **Hard disks are difficult to replace with memory**. Firstly, data in memory is lost after power off, making it unsuitable for long-term data storage; secondly, the cost of memory is dozens of times that of hard disks, making it difficult to popularize in the consumer market.
- **It is difficult for caches to have both large capacity and high speed**. As the capacity of L1, L2, L3 caches gradually increases, their physical size becomes larger, increasing the physical distance from the CPU core, leading to increased data transfer time and higher element access latency. Under current technology, a multi-level cache structure is the best balance between capacity, speed, and cost.
![Computer Storage System](ram_and_cache.assets/storage_pyramid.png)
!!! note
The storage hierarchy of computers reflects a delicate balance between speed, capacity, and cost. In fact, this kind of trade-off is common in all industrial fields, requiring us to find the best balance between different advantages and limitations.
Overall, **hard disks are used for long-term storage of large amounts of data, memory is used for temporary storage of data being processed during program execution, and cache is used to store frequently accessed data and instructions** to improve program execution efficiency. Together, they ensure the efficient operation of computer systems.
As shown in the figure below, during program execution, data is read from the hard disk into memory for CPU computation. The cache can be considered a part of the CPU, **smartly loading data from memory** to provide fast data access to the CPU, significantly enhancing program execution efficiency and reducing reliance on slower memory.
![Data Flow Between Hard Disk, Memory, and Cache](ram_and_cache.assets/computer_storage_devices.png)
## Memory Efficiency of Data Structures
In terms of memory space utilization, arrays and linked lists have their advantages and limitations.
On one hand, **memory is limited, and the same block of memory cannot be shared by multiple programs**, so we want data structures to use space as efficiently as possible. Array elements are tightly packed and, unlike linked list nodes, need no extra space for references (pointers), making arrays more space-efficient. However, an array must allocate enough contiguous memory up front, which may waste memory, and expanding an array incurs additional time and space costs. In contrast, linked lists allocate and reclaim memory dynamically on a per-node basis, providing greater flexibility.
On the other hand, during program execution, **as memory is repeatedly allocated and released, the degree of fragmentation of free memory becomes higher**, leading to reduced memory utilization efficiency. Arrays, due to their continuous storage method, are relatively less likely to cause memory fragmentation. In contrast, the elements of a linked list are dispersedly stored, and frequent insertion and deletion operations make memory fragmentation more likely.
## Cache Efficiency of Data Structures
Although caches are much smaller in space capacity than memory, they are much faster and play a crucial role in program execution speed. Since the cache's capacity is limited and can only store a small part of frequently accessed data, when the CPU tries to access data not in the cache, a "cache miss" occurs, forcing the CPU to load the needed data from slower memory.
Clearly, **the fewer the cache misses, the higher the CPU's data read-write efficiency**, and the better the program performance. The proportion of successful data retrieval from the cache by the CPU is called the "cache hit rate," a metric often used to measure cache efficiency.
To achieve higher efficiency, caches adopt the following data loading mechanisms.
- **Cache Lines**: Caches don't store and load data byte by byte but in units of cache lines. Compared to byte-by-byte transfer, the transmission of cache lines is more efficient.
- **Prefetch Mechanism**: Processors try to predict data access patterns (such as sequential access, fixed stride jumping access, etc.) and load data into the cache according to specific patterns to improve the hit rate.
- **Spatial Locality**: If data is accessed, data nearby is likely to be accessed in the near future. Therefore, when loading certain data, the cache also loads nearby data to improve the hit rate.
- **Temporal Locality**: If data is accessed, it's likely to be accessed again in the near future. Caches use this principle to retain recently accessed data to improve the hit rate.
In fact, **arrays and linked lists have different cache utilization efficiencies**, mainly reflected in the following aspects.
- **Occupied Space**: Linked list elements occupy more space than array elements, resulting in less effective data volume in the cache.
- **Cache Lines**: Linked list data is scattered throughout memory, and since caches load "by line," the proportion of loading invalid data is higher.
- **Prefetch Mechanism**: The data access pattern of arrays is more "predictable" than that of linked lists, meaning the system is more likely to guess which data will be loaded next.
- **Spatial Locality**: Arrays are stored in concentrated memory spaces, so the data near the loaded data is more likely to be accessed next.
Overall, **arrays have a higher cache hit rate and are generally more efficient in operation than linked lists**. This makes data structures based on arrays more popular in solving algorithmic problems.
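To make the cache-line argument concrete, here is a toy simulation. Every parameter is an assumption for illustration only: a 64-byte cache line, a cache holding 8 lines with least-recently-used replacement, 8-byte elements, and a hypothetical linked list whose nodes each land on a different cache line. Real hardware also prefetches, which this sketch ignores.

```python
import random
from collections import OrderedDict

LINE_SIZE = 64   # bytes per cache line (assumed)
NUM_LINES = 8    # number of lines the toy cache can hold (assumed)
ELEM_SIZE = 8    # bytes per element (assumed)


def count_misses(addresses: list[int]) -> int:
    """Count cache misses for a sequence of byte addresses, using LRU replacement."""
    cache = OrderedDict()  # keys are cache-line numbers, ordered from oldest to newest
    misses = 0
    for addr in addresses:
        line = addr // LINE_SIZE
        if line in cache:
            cache.move_to_end(line)        # hit: mark the line as most recently used
        else:
            misses += 1                    # miss: the whole line is loaded
            cache[line] = None
            if len(cache) > NUM_LINES:
                cache.popitem(last=False)  # evict the least recently used line
    return misses


n = 1024
# Array: elements sit back to back, so 8 of them share each cache line
array_addresses = [i * ELEM_SIZE for i in range(n)]

# Linked list (hypothetical layout): every node sits on its own cache line,
# and the traversal visits those lines in a scattered order
node_slots = list(range(n))
random.Random(0).shuffle(node_slots)
list_addresses = [slot * LINE_SIZE for slot in node_slots]

array_misses = count_misses(array_addresses)  # 128: one miss per 8 elements
list_misses = count_misses(list_addresses)    # 1024: one miss per node
```

In this model the array misses once per cache line and then hits on the 7 neighboring elements, while each list node drags in a full line of which only one element is useful.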
It should be noted that **high cache efficiency does not mean that arrays are always better than linked lists**. Which data structure to choose in actual applications should be based on specific requirements. For example, both arrays and linked lists can implement the "stack" data structure (which will be detailed in the next chapter), but they are suitable for different scenarios.
- In algorithm problems, we tend to choose stacks based on arrays because they provide higher operational efficiency and random access capabilities, with the only cost being the need to pre-allocate a certain amount of memory space for the array.
- If the data volume is very large, highly dynamic, and the expected size of the stack is difficult to estimate, then a stack based on a linked list is more appropriate. Linked lists can disperse a large amount of data in different parts of the memory and avoid the additional overhead of array expansion.


@ -0,0 +1,81 @@
# Summary
### Key Review
- Arrays and linked lists are two fundamental data structures, representing two storage methods in computer memory: continuous space storage and dispersed space storage. Their characteristics complement each other.
- Arrays support random access and use less memory; however, they are inefficient in inserting and deleting elements and have a fixed length after initialization.
- Linked lists implement efficient node insertion and deletion through changing references (pointers) and can flexibly adjust their length; however, they have lower node access efficiency and use more memory.
- Common types of linked lists include singly linked lists, circular linked lists, and doubly linked lists, each with its own application scenarios.
- Lists are ordered collections of elements that support addition, deletion, and modification, typically implemented based on dynamic arrays, retaining the advantages of arrays while allowing flexible length adjustment.
- The advent of lists significantly enhanced the practicality of arrays but may lead to some memory space wastage.
- During program execution, data is mainly stored in memory. Arrays provide higher memory space efficiency, while linked lists are more flexible in memory usage.
- Caches provide fast data access to CPUs through mechanisms like cache lines, prefetching, spatial locality, and temporal locality, significantly enhancing program execution efficiency.
- Due to higher cache hit rates, arrays are generally more efficient than linked lists. When choosing a data structure, the appropriate choice should be made based on specific needs and scenarios.
### Q & A
!!! question "Does storing arrays on the stack versus the heap affect time and space efficiency?"
Arrays stored on both the stack and heap are stored in continuous memory spaces, and data operation efficiency is essentially the same. However, stacks and heaps have their own characteristics, leading to the following differences.
1. Allocation and release efficiency: The stack is a relatively small memory region that the compiler manages automatically; the heap is larger, is allocated dynamically by the code, and is more prone to fragmentation. Therefore, allocation and release operations on the heap are generally slower than on the stack.
2. Size limitation: Stack memory is relatively small, while the heap size is generally limited by available memory. Therefore, the heap is more suitable for storing large arrays.
3. Flexibility: The size of arrays on the stack needs to be determined at compile-time, while the size of arrays on the heap can be dynamically determined at runtime.
!!! question "Why do arrays require elements of the same type, while linked lists do not emphasize same-type elements?"
Linked lists consist of nodes connected by references (pointers), and each node can store data of different types, such as int, double, string, object, etc.
In contrast, array elements must be of the same type, allowing the calculation of offsets to access the corresponding element positions. For example, an array containing both int and long types, with single elements occupying 4 bytes and 8 bytes respectively, cannot use the following formula to calculate offsets, as the array contains elements of two different lengths.
```shell
# Element memory address = Array memory address + Element length * Element index
```
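The formula can be checked with a few lines of Python. The base address `0x1000` and the function name `element_address` are hypothetical values chosen for this sketch.

```python
from itertools import accumulate


def element_address(base_address: int, element_length: int, index: int) -> int:
    """Element memory address = array memory address + element length * element index."""
    return base_address + element_length * index


base = 0x1000  # hypothetical array address (4096 in decimal)

# Uniform 4-byte elements: every address follows directly from the index
int_addresses = [element_address(base, 4, i) for i in range(4)]
# [4096, 4100, 4104, 4108]

# Mixed 4-byte and 8-byte elements: an offset depends on all preceding lengths,
# so it can no longer be computed from the index alone
mixed_lengths = [4, 8, 4, 8]
mixed_offsets = [0] + list(accumulate(mixed_lengths))[:-1]
# [0, 4, 12, 16]
```

With uniform elements the address is a single multiplication and addition, which is what makes $O(1)$ random access possible; with mixed lengths, finding element `i` requires summing the lengths of everything before it.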
!!! question "After deleting a node, is it necessary to set `P.next` to `None`?"
Not modifying `P.next` is also acceptable. From the perspective of the linked list, traversing from the head node to the tail node will no longer encounter `P`. This means that node `P` has been effectively removed from the list, and where `P` points no longer affects the list.
From a garbage collection perspective, for languages with automatic garbage collection mechanisms like Java, Python, and Go, whether node `P` is collected depends on whether there are still references pointing to it, not on the value of `P.next`. In languages like C and C++, we need to manually free the node's memory.
!!! question "In linked lists, the time complexity for insertion and deletion operations is `O(1)`. But searching for the element before insertion or deletion takes `O(n)` time, so why isn't the time complexity `O(n)`?"
If an element is searched first and then deleted, the time complexity is indeed `O(n)`. However, the `O(1)` advantage of linked lists in insertion and deletion can be realized in other applications. For example, in the implementation of double-ended queues using linked lists, we maintain pointers always pointing to the head and tail nodes, making each insertion and deletion operation `O(1)`.
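A minimal sketch of that idea, assuming a doubly linked list with maintained `head` and `tail` references. The class and method names here are illustrative, not taken from the book's source code.

```python
class ListNode:
    """Doubly linked list node."""

    def __init__(self, val: int):
        self.val = val
        self.prev = None  # reference to the predecessor node
        self.next = None  # reference to the successor node


class LinkedListDeque:
    """Double-ended queue: every operation touches only the head or the tail."""

    def __init__(self):
        self.head = None
        self.tail = None
        self.size = 0

    def push_first(self, val: int) -> None:
        node = ListNode(val)
        if self.head is None:
            self.head = self.tail = node
        else:
            node.next = self.head
            self.head.prev = node
            self.head = node
        self.size += 1

    def push_last(self, val: int) -> None:
        node = ListNode(val)
        if self.tail is None:
            self.head = self.tail = node
        else:
            node.prev = self.tail
            self.tail.next = node
            self.tail = node
        self.size += 1

    def pop_first(self) -> int:
        if self.head is None:
            raise IndexError("pop from an empty deque")
        node = self.head
        self.head = node.next
        if self.head is None:
            self.tail = None       # the deque is now empty
        else:
            self.head.prev = None
        self.size -= 1
        return node.val

    def pop_last(self) -> int:
        if self.tail is None:
            raise IndexError("pop from an empty deque")
        node = self.tail
        self.tail = node.prev
        if self.tail is None:
            self.head = None       # the deque is now empty
        else:
            self.tail.next = None
        self.size -= 1
        return node.val

    def to_list(self) -> list[int]:
        """Traverse from head to tail (O(n), used here only for inspection)."""
        values = []
        node = self.head
        while node is not None:
            values.append(node.val)
            node = node.next
        return values


dq = LinkedListDeque()
dq.push_last(1)
dq.push_last(2)
dq.push_first(0)
snapshot = dq.to_list()   # [0, 1, 2]
first = dq.pop_first()    # 0
last = dq.pop_last()      # 2
remaining = dq.to_list()  # [1]
```

No push or pop contains a loop: because the positions to modify are always known in advance, no search is needed and each operation stays `O(1)`.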
!!! question "In the image 'Linked List Definition and Storage Method', do the light blue storage nodes occupy a single memory address, or do they share half with the node value?"
The diagram is just a qualitative representation; quantitative analysis depends on specific situations.
- Different types of node values occupy different amounts of space, such as int, long, double, and object instances.
- The memory space occupied by pointer variables depends on the operating system and compilation environment used, usually 8 bytes or 4 bytes.
!!! question "Is adding elements to the end of a list always `O(1)`?"
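Not always. If adding an element would exceed the list's capacity, the list must be expanded first. The system requests a new, larger memory block and moves all elements of the original list over, in which case the time complexity of that single operation becomes `O(n)`. Because expansions happen only occasionally, the cost averaged over many appends is still `O(1)`.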
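The following sketch counts how many element moves the expansions cause over a run of appends. It assumes an initial capacity of 10 and a doubling strategy; the function name `count_moves` is made up for this example.

```python
def count_moves(num_appends: int, initial_capacity: int = 10, ratio: int = 2) -> int:
    """Count the elements copied by expansions during num_appends appends."""
    capacity = initial_capacity
    size = 0
    moves = 0
    for _ in range(num_appends):
        if size == capacity:
            moves += size      # an expansion copies every existing element
            capacity *= ratio
        size += 1
    return moves


moves_for_1000 = count_moves(1000)
# Expansions happen at sizes 10, 20, 40, 80, 160, 320, and 640:
# 10 + 20 + 40 + 80 + 160 + 320 + 640 = 1270 moves in total
```

Under these assumptions, 1000 appends cost 1270 extra moves, or about 1.27 per append. A single expansion is expensive, but the total grows linearly with the number of appends.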
!!! question "The statement 'The emergence of lists greatly improves the practicality of arrays, but may lead to some memory space wastage' - does this refer to the memory occupied by additional variables like capacity, length, and expansion multiplier?"
The space wastage here mainly refers to two aspects: on the one hand, lists are set with an initial length, which we may not always need; on the other hand, to prevent frequent expansion, expansion usually multiplies by a coefficient, such as $\times 1.5$. This results in many empty slots, which we typically cannot fully fill.
!!! question "In Python, after initializing `n = [1, 2, 3]`, the addresses of these 3 elements are contiguous, but initializing `m = [2, 1, 3]` shows that each element's `id` is not consecutive but identical to those in `n`. If the addresses of these elements are not contiguous, is `m` still an array?"
If we replace list elements with linked list nodes `n = [n1, n2, n3, n4, n5]`, these 5 node objects are also typically dispersed throughout memory. However, given a list index, we can still access the node's memory address in `O(1)` time, thereby accessing the corresponding node. This is because the array stores references to the nodes, not the nodes themselves.
Unlike many languages, in Python, numbers are also wrapped as objects, and lists store references to these numbers, not the numbers themselves. Therefore, we find that the same number in two arrays has the same `id`, and these numbers' memory addresses need not be contiguous.
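Both observations can be reproduced with a short experiment. Note that the shared identity of small integers is a CPython implementation detail (it caches small integer objects), so this sketch assumes CPython; the `Node` class is a hypothetical stand-in for a linked list node.

```python
n = [1, 2, 3]
m = [2, 1, 3]

# Both lists hold references to the same three integer objects
same_object = n[0] is m[1]                                       # the int 1
same_ids = sorted(id(x) for x in n) == sorted(id(x) for x in m)


class Node:
    """Hypothetical node object used to show that lists store references."""

    def __init__(self, val: int):
        self.val = val


nodes = [Node(i) for i in range(3)]
alias = nodes[1]   # index access returns the stored reference in O(1) time
alias.val = 42     # modifying through the alias changes the object the list refers to
node_val = nodes[1].val
```

The list itself is still a contiguous array, but what sits in its slots are references. The objects they point to can live anywhere in memory.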
!!! question "The `std::list` in C++ STL has already implemented a doubly linked list, but it seems that some algorithm books don't directly use it. Is there any limitation?"
On the one hand, we often prefer to use arrays to implement algorithms, only using linked lists when necessary, mainly for two reasons.
- Space overhead: Since each element requires two additional pointers (one for the previous element and one for the next), `std::list` usually occupies more space than `std::vector`.
- Cache unfriendly: As the data is not stored contiguously, `std::list` has a lower cache utilization rate. Generally, `std::vector` performs better.
On the other hand, the cases where linked structures are truly necessary are mainly binary trees and graphs. Stacks and queues are usually implemented with the `stack` and `queue` classes that the programming language provides, rather than with linked lists.
!!! question "Does initializing a list `res = [0] * self.size()` result in each element of `res` referencing the same address?"
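Not in a way that causes problems. Each slot does refer to the same integer object `0`, but integers are immutable, so assigning to `res[i]` rebinds only that slot. The aliasing issue arises with two-dimensional lists: for example, initializing `res = [[0]] * self.size()` references the same inner list `[0]` multiple times, so modifying one row modifies them all.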
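A minimal sketch of the difference, using a plain variable `size` as a stand-in for `self.size()`:

```python
size = 3

# One-dimensional case: assignment rebinds a single slot, so it is safe
flat = [0] * size
flat[0] = 7

# Two-dimensional case: every row is the SAME inner list object
aliased = [[0]] * size
aliased[0][0] = 7  # the change is visible through every row

# A list comprehension builds a separate inner list per row
independent = [[0] for _ in range(size)]
independent[0][0] = 7  # only the first row changes

print(flat)
print(aliased)
print(independent)
```

Building each row inside a list comprehension is the usual way to avoid the shared-row problem.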
!!! question "In deleting a node, is it necessary to break the reference to its successor node?"
From the perspective of data structures and algorithms (problem-solving), it is fine not to break the link, as long as the program's logic is correct. From the perspective of standard libraries, breaking the link is safer and logically clearer. If the link is not broken and the deleted node is not properly reclaimed, its lingering reference may prevent the successor node's memory from being reclaimed.
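As an illustration, here is a minimal sketch of removing a node from a singly linked list while also clearing the removed node's link. The names `ListNode` and `remove_after` are hypothetical and chosen only for this example.

```python
class ListNode:
    """Singly linked list node (illustrative definition)."""

    def __init__(self, val: int):
        self.val = val
        self.next: "ListNode | None" = None


def remove_after(n0: ListNode) -> "ListNode | None":
    """Remove the node that follows n0 and return it, or None if there is none."""
    removed = n0.next
    if removed is None:
        return None
    # Bypass the removed node
    n0.next = removed.next
    # Optional but safer: the removed node no longer references its successor
    removed.next = None
    return removed


# Build 1 -> 2 -> 3, then remove the node holding 2
head = ListNode(1)
head.next = ListNode(2)
head.next.next = ListNode(3)
removed = remove_after(head)
print(head.val, head.next.val, removed.val, removed.next)
```

Omitting the `removed.next = None` line would leave the list itself correct; the removed node would simply keep pointing at its former successor.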

View File

@ -13,23 +13,23 @@ When we think of data in computers, we imagine various forms like text, images,
The range of values for fundamental data types depends on the size of the space they occupy. Below, we take Java as an example.
- The integer type `byte` occupies 1 byte = 8 bits and can represent \(2^8\) numbers.
- The integer type `int` occupies 4 bytes = 32 bits and can represent \(2^{32}\) numbers.
- The integer type `byte` occupies 1 byte = 8 bits and can represent $2^8$ numbers.
- The integer type `int` occupies 4 bytes = 32 bits and can represent $2^{32}$ numbers.
The following table lists the space occupied, value range, and default values of various fundamental data types in Java. This table does not need to be memorized, but understood roughly and referred to when needed.
<p align="center"> Table <id> &nbsp; Space Occupied and Value Range of Fundamental Data Types </p>
| Type | Symbol | Space Occupied | Minimum Value | Maximum Value | Default Value |
| ------- | -------- | -------------- | -------------------------- | ------------------------- | ---------------- |
| Integer | `byte` | 1 byte | \(-2^7\) (\(-128\)) | \(2^7 - 1\) (\(127\)) | 0 |
| | `short` | 2 bytes | \(-2^{15}\) | \(2^{15} - 1\) | 0 |
| | `int` | 4 bytes | \(-2^{31}\) | \(2^{31} - 1\) | 0 |
| | `long` | 8 bytes | \(-2^{63}\) | \(2^{63} - 1\) | 0 |
| Float | `float` | 4 bytes | \(1.175 \times 10^{-38}\) | \(3.403 \times 10^{38}\) | \(0.0\text{f}\) |
| | `double` | 8 bytes | \(2.225 \times 10^{-308}\) | \(1.798 \times 10^{308}\) | 0.0 |
| Char | `char` | 2 bytes | 0 | \(2^{16} - 1\) | 0 |
| Boolean | `boolean` | 1 byte | \(\text{false}\) | \(\text{true}\) | \(\text{false}\) |
| Type | Symbol | Space Occupied | Minimum Value | Maximum Value | Default Value |
| ------- | -------- | -------------- | ------------------------ | ----------------------- | -------------- |
| Integer | `byte` | 1 byte | $-2^7$ ($-128$) | $2^7 - 1$ ($127$) | 0 |
| | `short` | 2 bytes | $-2^{15}$ | $2^{15} - 1$ | 0 |
| | `int` | 4 bytes | $-2^{31}$ | $2^{31} - 1$ | 0 |
| | `long` | 8 bytes | $-2^{63}$ | $2^{63} - 1$ | 0 |
| Float | `float` | 4 bytes | $1.175 \times 10^{-38}$ | $3.403 \times 10^{38}$ | $0.0\text{f}$ |
| | `double` | 8 bytes | $2.225 \times 10^{-308}$ | $1.798 \times 10^{308}$ | 0.0 |
| Char | `char` | 2 bytes | 0 | $2^{16} - 1$ | 0 |
| Boolean | `boolean` | 1 byte | $\text{false}$ | $\text{true}$ | $\text{false}$ |
Please note that the above table is specific to Java's fundamental data types. Each programming language has its own data type definitions, and their space occupied, value ranges, and default values may differ.