資料庫基本理論詳細介紹-mysql教程-PHP中文網

1、資料庫範式

第一範式：列不可分，eg:【聯絡人】（姓名，性別，電話），一個聯絡人有家庭電話和公司電話，那麼這種表結構設計就沒有達到1NF；

第二範式：有主鍵，保證完全依賴。 eg:訂單明細表【OrderDetail】（OrderID，ProductID，UnitPrice，Discount，Quantity，ProductName），Discount（折扣），Quantity（數量）完全依賴（取決）於主鍵（OderID，ProductID），而UnitPrice，ProductName 只依賴於主鍵（OderID，ProductID），而UnitPrice，ProductName 只依賴於ProductID，不符合2NF；

第三範式：無傳遞依賴(非主鍵列A 依賴非主鍵列B，非主鍵列B 依賴主鍵的情況)，eg:訂單表【Order】（OrderID，OrderDate，CustomerID，CustomerName，CustomerAddr，CustomerCity）主鍵是（OrderID），CustomerName，CustomerAddr，CustomerCity 直接依賴的是CustomerID（非主鍵列），而不是直接依賴主鍵，它是透過傳遞才依賴於主鍵列），而不是直接依賴主鍵，它是透過傳遞才依賴於主鍵，所以不符合3NF。

2、什麼是反模式

範式可以避免資料冗餘，減少資料庫的空間，並減輕維護資料完整性的麻煩。

然而，透過資料庫範式化設計，將導致資料庫業務涉及的表變多，並且可能需要將涉及的業務表進行多表連接查詢，這將導致效能變差，且不利於分庫分錶。因此，出於效能優先的考量，可能在資料庫的結構中需要使用反模式的設計，即空間換取時間，採取資料冗餘的方式避免表之間的關聯查詢。至於資料一致性問題，因為難以滿足資料強一致性，一般情況下，使儲存資料盡可能達到使用者一致，確保系統經過一段較短的時間的自我恢復和修正，資料最終達到一致。

需要謹慎使用反模式設計資料庫。一般情況下，盡可能使用範式化的資料庫設計，因為範式化的資料庫設計能讓產品更靈活，並且能在資料庫層保持資料完整性。

有的時候，提升性能最好的方法是在同一表中保存冗餘數據，如果能容許少量的髒數據，創建一張完全獨立的匯總表或緩存表是非常好的方法。舉個例子，設計一張「下載次數表」來快取下載次數信息，可使在海量資料的情況下，提高查詢總數資訊的速度。

另外一個比較典型的場景，出於擴展性考慮，可能會使用BLOB 和TEXT 類型的列存儲JSON 結構的數據，這樣的好處在於可以在任何時候，將新的屬性添加到這個在字段中，而不需要更改表結構。但是，這個設計的缺點也比較明顯，就是需要取得整個欄位內容進行解碼來取得指定的屬性，並且無法進行索引、排序、聚合等操作。因此，如果需要考慮更複雜的使用場景，更加建議使用 MongoDB 這樣的文檔型資料庫。

3、資料庫事務

交易是一個不可分割的資料庫操作序列，也是資料庫並發控制的基本單位，其執行的結果必須使資料庫從一種一致性狀態變成另一種一致性狀態。

(1). 事務的特徵

原子性(Atomicity)：事務所包含的一系列資料庫操作要麼全部成功執行，要麼全部回滾；

#性(Consistency)：交易的執行結果必須使資料庫從一個一致性狀態到另一個一致性狀態；

隔離性(Isolation)：並發執行的事務之間不能相互影響；

持久性(Durability)：交易一旦提交，對資料庫中資料的改變是永久性的。

(2). 交易並發帶來的問題

髒讀：一個交易讀取了另一個交易未提交的資料；

不可重複讀取：不可重複讀的重點是修改，同樣條件下兩次讀取結果不同，也就是說，被讀取的資料可以被其它事務修改；

幻讀：幻讀的重點在於新增或刪除，同樣條件下兩次讀出來的記錄數不一樣。

(3). 隔離等級

隔離等級決定了一個session中的事務可能對另一個session中的事務的影響。

ANSI標準定義了4個隔離級別，MySQL的InnoDB都支持，分別是：

READ UNCOMMITTED（未提交讀取）：最低級別的隔離，通常又稱為dirty read，它允許一個事務讀取另一個事務還沒commit的數據，這樣可能會提高效能，但是會導致髒讀問題；

READ COMMITTED（提交讀取）：在一個事務中只允許對其它事務已經commit的記錄可見，該隔離等級不能避免不可重複讀取問題；

REPEATABLE READ（可重複讀取）：在一個事務開始後，其他事務對資料庫的修改在本事務中不可見，直到本事務commit或rollback。但是，其他事務的insert/delete操作對該事務是可見的，也就是說，該隔離等級並不能避免幻讀問題。在一個事務中重複select的結果一樣，除非本事務中update資料庫。

SERIALIZABLE (serializable): The highest level of isolation, only allowing transactions to be executed serially.

The default isolation level of MySQL is REPEATABLE READ.

資料庫基本理論詳細介紹

4. What is a stored procedure? What are the advantages and disadvantages?

A stored procedure is a collection of SQL statements that have been compiled in advance and stored in the database. Furthermore, a stored procedure is a code block composed of some T-SQL statements. These T-SQL statement codes implement some functions like a method (add, delete, modify and query a single table or multiple tables), and then give this code block Just give it a name and call it when you use this function. Stored procedures have the following characteristics:

4.1. Stored procedures are only compiled when they are created. There is no need to recompile each time the stored procedure is executed in the future. Generally, SQL statements are compiled once every time they are executed, so using storage The process can improve the execution efficiency of the database;

4.2. When the SQL statement changes, you can only modify the stored procedure in the database without modifying the code;

4.3. Reduce network transmission, on the client Of course, the amount of data transferred by calling a stored procedure is smaller than executing a series of SQL;

4.4. Through stored procedures, users without permission can indirectly access the database under control, thereby ensuring data security.

5. Let’s briefly talk about the differences between drop, delete, and truncate.

drop, delete, and truncate in SQL all mean deletion, but there are some differences among the three. Difference:

Delete is used to delete all or part of the data rows in the table. After executing delete, the user needs to commit or rollback to perform deletion or undelete. The delete command will trigger the operation on this table. All delete triggers;

Truncate deletes all data in the table. This operation cannot be rolled back and will not trigger triggers on this table. TRUNCATE is faster than delete and takes up less space;

The Drop command deletes a table from the database. All data rows, indexes and permissions will also be deleted. All DML triggers will not be triggered, and this command cannot be rolled back.

Therefore, when a table is no longer needed, use drop; when you want to delete some data rows, use delete; when retaining the table and deleting all data, use truncate.

6. What is a view? What is a cursor?

A view is a virtual table, usually a subset of rows or columns in one table or multiple tables. It has the same functions as a physical table and can be added to the view. , delete, modify, check and other operations. In particular, modifications to the view do not affect the underlying table. Compared with multi-table queries, it makes it easier for us to obtain data.

The cursor effectively processes the query result set as a unit. The cursor can be positioned on a specific row in the cell to retrieve one or more rows from the current row in the result set. You can make changes to the current row of the result set. Cursors are generally not used, but when data needs to be processed one by one, cursors are very important.

When operating mysql, we know that the MySQL retrieval operation returns a set of rows called a result set. The set of rows returned are all rows that match the SQL statement (zero or more rows). With a simple SELECT statement, for example, there is no way to get the first row, the next row, or the first 10 rows, nor is there an easy way to process all rows one at a time (as opposed to processing them in batches). Sometimes it is necessary to move forward or backward one or more rows in the retrieved rows. This is why cursors are used. A cursor is a database query stored on the MySQL server. It is not a SELECT statement, but a result set retrieved by the statement. After the cursor is stored, the application can scroll or browse the data within it as needed. Cursors are primarily used in interactive applications where the user needs to scroll through data on the screen and browse or make changes to the data.

7. What is a trigger?

A trigger is a database object related to a table that is triggered when defined conditions are met and executes the set of statements defined in the trigger. This feature of triggers can help applications ensure database integrity on the database side.

8. Super key, candidate key, primary key, foreign key

Super key: The set of attributes that can uniquely identify a tuple in a relationship is called the super key of the relational schema. key. An attribute can be used as a super key, and multiple attributes combined can also be used as a super key. Super key contains candidate key and primary key.

Candidate key: It is the minimum superkey, that is, a superkey without redundant elements.

Primary key: A combination of data columns or attributes in a database table that uniquely and completely identifies the stored data object. A data column can only have one primary key, and the value of the primary key cannot be missing, that is, it cannot be null.

Foreign key: The primary key of another table that exists in one table is called the foreign key of this table.

9. What is a transaction? What is a lock?

Transaction: It is a group of SQL statements that are bound together as a logical unit of work. If any statement operation fails, the entire operation will fail, and future operations will be rolled back to the state before the operation, or there will be a node. To ensure that something is either executed or not executed, transactions can be used. For a grouped statement to be considered a transaction, it needs to pass the ACID tests, namely atomicity, consistency, isolation, and durability.

Lock: In all DBMS, locks are the key to realizing transactions. Locks can ensure the integrity and concurrency of transactions. Just like a lock in real life, it can prevent the owner of certain data from using certain data or data structures for a certain period of time. Of course, locks are also divided into levels.

10. Database locking mechanism

The database locking mechanism simply means that in order to ensure the consistency of the data, the database allows various shared resources to be accessed concurrently, and the access changes. A rule designed in an orderly manner. Each MySQL storage engine uses three types (levels) of locking mechanisms: row-level locking, page-level locking and table-level locking.

Table-level locking (table-level): Table-level locking is the most granular locking mechanism among MySQL storage engines. The biggest feature of this locking mechanism is that the implementation logic is very simple and brings minimal negative impact on the system. So acquiring and releasing locks is very fast. Since table-level locks lock the entire table at one time, the deadlock problem that plagues us can be avoided. Of course, the biggest negative impact of large locking granularity is that the probability of contention for locking resources will be the highest, which will greatly reduce the efficiency. Table-level locks are divided into read locks and write locks. Page-level locking (page-level): The characteristic of page-level locking is that the locking granularity is between row-level locking and table-level locking, so the resource overhead required to obtain the lock and the concurrent processing capability it can provide are also Between the above two. In addition, page-level locking and row-level locking will cause deadlock. Row-level locking (row-level): The biggest feature of row-level locking is that the granularity of the locked object is very small, and it is also the smallest locking granularity achieved by currently major database management software. Since the locking granularity is very small, the probability of contention for locking resources is also minimal, which can give the application as much concurrent processing capability as possible and improve the overall performance of some application systems that require high concurrency. Although it has great advantages in concurrent processing capabilities, row-level locking also brings many disadvantages. Since the granularity of locking resources is very small, more things need to be done each time to acquire and release the lock, which naturally causes greater consumption. In addition, row-level locking is also the most prone to deadlock. InnoDB's row-level locks are also divided into two types, shared locks and exclusive locks. Similarly, InnoDB also introduced the concept of intention locks (table-level locks), so there are intention shared locks and intention exclusive locks, so InnoDB actually has Four types of locks, namely shared lock (S), exclusive lock (X), intention shared lock (IS), and intention exclusive lock (IX);

In the MySQL database, table-level locking is mainly used by MyISAM , Memory, CSV and other non-transactional storage engines. Row-level locking is mainly used by the Innodb storage engine and NDBCluster storage engine. Page-level locking is mainly the locking method of the BerkeleyDB storage engine.

The function of intention lock is that when a transaction needs to obtain a resource lock, if the resource it needs is already occupied by an exclusive lock, the transaction can add an appropriate row to the table that needs to lock the row. Intention lock. If you need a shared lock, add an intention shared lock on the table. If you need to add an exclusive lock on a certain row (or some rows), first add an intention exclusive lock on the table. Multiple intention shared locks can exist at the same time, but only one intention exclusive lock can exist at the same time.

資料庫基本理論詳細介紹