PostgreSQL Operation Problems

Last Update:2018-12-07 Source: Internet

Author: User

Tags sql client psql

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

PostgreSQL Operation Problems
4.1) how to select only the first few rows of a query result? Or a random row?
If you only want to extract several rows of data and know the exact number of rows in the execution query, you can use the limit function. If an index matches the condition in order by, PostgreSQL may only process the required first few records (otherwise, the entire query will be processed until the required rows are generated ). If you do not know the exact number of records when performing the query function, you can use the cursor and fetch functions.

You can use the following method to extract a random record:

Select Cols
From Tab
Order by random ()
Limit 1;
4.2) how to view tables, indexes, databases, and user definitions? How can I view the query commands used in Psql and display them?
Use the \ DT command in Psql to display the definition of a data table. To understand the complete command list in Psql, use \? In addition, you can also read Psql'sSource codeFile pgsql/src/bin/Psql/describe. C, which includes all the SQL commands output to generate the Psql backslash command. You can also enable Psql with the-e option so that it prints the SQL queries actually used internally to execute the commands you give in Psql. PostgreSQL also provides an information schema interface compatible with SQL, where you can obtain information about the database.

In the system, some system tables with PG _ headers also describe the table definitions.

You can use the Psql-l command to list all databases.

You can also browse the pgsql/src/tutorial/syscat. source file, which lists many select syntaxes that can obtain information from database system tables.

4.3) how to change the data type of a field?
In analyticdb 8.0, it is easy to change the data type of a field. You can use alter table alter column type.

In previous versions, you can do this:

Begin;
Alter table tab add column new_col new_data_type;
Update tab set new_col = cast (old_col as new_data_type );
Alter table tab drop column old_col;
Commit;
You can then use the vacuum full tab command to reclaim the space occupied by invalid data.

4.4) What is the maximum size of a single database for a single row of records?
The following are some restrictions:

What is the maximum size of a database? Unlimited (a 32 TB database already exists)
What is the maximum size of a table? 32 TB
What is the maximum size of a record? 1.6 TB
What is the maximum size of a field? 1 GB
The maximum number of rows in a table? Unlimited
What is the maximum number of columns in a table? 250-1600 (related to column type)
What is the maximum number of indexes in a table? Unlimited

Of course, there is actually no real limit, but it is still subject to the constraints of available disk space, available memory/swap zone. In fact, when these values become abnormally large, the system performance will also be greatly affected.

The maximum size of a table is 32 TB without the support of the operating system for large files. Large tables are stored in multiple 1 GB files, so the size limit of the file system is not important.

If the default block size increases to 32 K, the maximum table size and the maximum number of columns can also be increased to four times.

4.5) How much disk space is required to store data in a typical text file?
The space occupied by an ipvs database (storing a text file) may need up to five times the size of the text file.

For example, assume that there is a file with 100,000 rows, each row has an integer and a text description. Assume that the average length of a text string is 20 bytes. The text file occupies 2.8 mb. The PostgreSQL database file that stores the data is about 6.4 MB:

32 bytes: Header of each line (estimated value)
24 bytes: an integer field and a text field
+ 4 Bytes: pointer to the tuples in the page
----------------------------------------
60 bytes per line

If the size of the PostgreSQL data page is 8192 bytes (8 KB), then:

8192 bytes per page
----------------- = 136 rows/data page (rounded down)
60 bytes per line

100000 data rows
-------------------- = 735 data page (rounded up)
128 rows per page

735 data page * 8192 Bytes/page = 6,021,120 bytes (6 MB)
Indexes do not require so much extra consumption, but they do include the data to be indexed, so they may also be very large.

NULL values are stored in the in-place graph, so they occupy a small amount of space.

4.6) Why is my query slow? Why are these queries not using indexes?
Not every query uses an Index automatically. Indexes are used only when the table size exceeds a minimum value and only a small proportion of records in the table are selected for query. This is because access to the disk immediately caused by index scanning may be slower than reading the table directly (Sequential Scan.

To determine whether an index is used, PostgreSQL must obtain the statistical value of the table. These statistical values can be obtained using vacuum analyze or analyze. Using the statistical value, the optimizer knows the number of rows in the table to better determine whether to use the index. The statistical value is also useful for determining the optimized connection sequence and method. When the table content changes, the statistical value should be updated and collected on a regular basis.

Indexes are generally not used for order by or join execution. An explicit sorting of a large table is usually faster than an index scan.

However, indexes are often used when limit and order by are used in combination, because only a small part of the table is returned. In fact, although max () and min () do not use indexes, it is also possible to obtain the maximum and minimum values by using indexes for order by and llimit:

Select col
From Tab
Order by col [DESC]
Limit 1;
If you are sure that the PostgreSQL optimizer uses an incorrect Sequential Scan, you can use the set enable_seqscan to 'off' command and run the query again, you can see whether an index scan is faster.

When using wildcards, such as like or ~ The index can only be used under specific circumstances:

The start part of the string must be a normal string, that is:
The like mode cannot start with %.
~ (Regular Expression) The pattern must start with ^.
A string cannot start with a pattern that matches multiple characters, for example, [A-E].
Case-insensitive searches, such as Ilike and ~ * If no index is used, you can use the function index described in section 4.8.
The default local Setting C locale must be used for initdb, because the system cannot know what the next maximum character is in non-C locale situations. In this case, you can create a special text_pattern_ops index for like indexes.
In versions earlier than 8.0, indexes are often not used unless the data type to be queried matches the index data type, especially for int2, int8, and numeric indexes.

4.7) How can I see how the query optimizer evaluates and processes my queries?
Refer to the explain manual page.

4.8) How do I search for a case-insensitive regular expression? How can I use indexes for case-insensitive searches?
Operator ~ Process Regular Expression matching, and ~ * Process case-insensitive Regular Expression matching. In upper case, irrelevant like variants become Ilike.

The case-insensitive equations are usually written as follows:

Select *
From Tab
Where lower (COL) = 'abc ';
Standard indexes will not be used. However, you can create an available function index:

Create index tabindex on tab (lower (COL ));
4.9) in a query, how do I check whether a field is null? How can I accurately sort data, regardless of whether a field contains null values?
Use is null and is not null to test this field. The specific method is as follows:

Select *
From Tab
Where col is null;
To sort null fields, use the is null and is not null modifiers in the order by condition. If the condition is true, the ratio of false to false is ranked first, in the following example, null records are placed in the top part of the result:

Select *
From Tab
Order by (COL is not null)
4.10) What are the differences between character types?
Type Internal Name Description
Varchar (n) varchar specifies the maximum length and variable length string. The unspecified length is not filled.
Char (n) bpchar is a fixed-length string. When the actual data is insufficient to define the length, fill it with spaces.
Text text has no special upper limit (only limited by the maximum length of rows)
Bytea variable-length byte sequence (null is also allowed)
"Char" char a character

In the system table and in some error messages, you will see the internal name.

The first four types listed above are "varlena" (variable length) (that is, the four first bytes are the length, followed by the data ). Therefore, the actual occupied space is more than the declared size. However, these types can be compressed or stored offline using toast, so the disk space may be less than expected.

Varchar (n) is the best in storing variable-length strings with limited maximum length. Text is applicable to strings with a maximum length of 1 GB but undefined length.

Char (n) is most suitable for storing strings of the same length. Char (n) fills in Space Based on the given field length (insufficient field content), while varchar (n) only stores the given data content. Bytea is used to store binary data, especially values that contain null bytes. These types have similar performance characteristics.

4.11.1) How do I create a serial number/automatically incrementing field?
PostgreSQL supports the serial data type. It automatically creates a sequence and index on the field. For example:

Create Table person (
Id serial,
Name text
);
Will be automatically converted:

Create sequence person_id_seq;
Create Table person (
Id int4 not null default nextval ('person _ id_seq '),
Name text
);
For more information about sequences, see the create_sequence manual.

4.11.2) How do I obtain the value of an inserted serial number?
One way is to first use the nextval () function to retrieve the next serial value from the sequence object before inserting it, And then explicitly insert it. The example table in 4.11.1 can be described as follows:

New_id = execute ("select nextval ('person _ id_seq ')");
Execute ("insert into person (ID, name) values (new_id, 'ise ise Pascal ')");
In this way, the new value stored in new_id can be used in other queries (for example, as the foreign key of the person table ). Note that the name of the automatically created sequence object will be <Table >_< serialcolumn> _ seq. Here, table and serialcolumn are the names of your table and your serial fields respectively.

Similarly, after the serial object is inserted by default, you can use the currval () function to retrieve the newly assigned serial value. For example:

Execute ("insert into person (name) values ('ise ise Pascal ')");
New_id = execute ("select currval ('person _ id_seq ')");
4.11.3) Does using currval () cause a conflict with other users (Race Condition?
No. Currval () returns the value assigned by the current session process rather than the current value of all users.

4.11.4) Why not reuse the serial number after the transaction is aborted due to an exception? Why is there interruption in the value of the serial number field?
To improve concurrency, the serial number is assigned to a running transaction as needed and is not locked before the transaction ends. This will cause an interval of the serial number after the transaction is aborted unexpectedly.

4.12) What is oId? What is ctid?
Each row of record created in PostgreSQL will obtain a unique OID, unless the without oids option is used during table creation. When the oId is created, a 4-byte integer is automatically generated. All OID types are unique throughout PostgreSQL. However, when it exceeds 4 billion, it will overflow, and the oId will repeat later. PostgreSQL uses oId in its internal system table to establish a connection between tables.

In a user's data table, it is best to use serial instead of oId. Because serial only ensures that the data in a single table is unique, the possibility of overflow is very small, serial8 can be used to save 8-byte Serial Number Fields.

Ctid is used to identify a specific physical row with a data block (address) and (within the block) offset. Ctid changes after the record is changed or reloaded. The index entry points to physical rows.

4.13) Why do I receive the error message "error: Memory Exhausted in allocsetalloc ()"?
This is probably because the system's virtual memory is used up, or the kernel has a low limit on some resources. Try the following command before starting Postmaster:

Ulimit-D 262144
Limit datasize 256 m
Depending on the shell you use, the above command can only be successful, but it will set the data segment limit of your process to a relatively high level, so it may be able to complete the query. This command is applied to the current process and all sub-processes created after this command is run. If you encounter a problem when running the SQL client because too much data is returned in the background, run the preceding command before running the client.

4.14) How can I know the version of postgresql running?
From Psql, enter select version (); command.

4.15) How do I create a field whose default value is the current time?
Use current_timestamp:

Create Table Test (x int, modtime timestamp default current_timestamp );
4.16) How do I perform Outer Join (Outer Join )?
PostgreSQL uses standard SQL syntax to support external connections. Here are two examples:

Select *
From T1 left Outer Join T2 on (t1.col = t2.col );
Or

Select *
From T1 left Outer Join T2 using (COL );
The two equivalent queries connect t1.col and t2.col, and return all unconnected rows in T1 (those that do not match in T2 ). Right [outer] Join returns the rows that are not connected in T2. Full outer join returns the rows that are not connected in T1 and T2. The keyword outer is optional in the left [outer] connection, right [outer] connection, and full [outer] connection. A common connection is called an inner join ).

4.17) How to Use queries involving multiple databases?
There is no way to query databases other than the current database. Because PostgreSQL needs to load database-related system directories (system tables), it is not clear how to execute cross-database queries.

The added value-added modules contrib/dblink allow cross-database queries by using function calls. Of course, you can also connect to different databases to perform queries and then merge the results on the client.

4.18) How can I have the function return multiple rows or columns?
The ability to return a data record set in a function is easy to use, see: http://techdocs.postgresql.org/guides/SetReturningFunctions

4.19) Why do I receive the error "relation with OID #### does not exist" when using the PL/pgsql function to access a temporary table "?
PL/pgsql caches the function content. A bad side is that if a PL/pgsql function accesses a temporary table, the table is deleted and rebuilt, then, calling this function again will fail because the cached function content still points to the old temporary table. The solution is to use execute in PL/pgsql to access the temporary table. This will ensure that the query will always be re-parsed before execution.

4.27) what data replication solutions are currently available?
"Replication" is just a term. There are several Replication technologies available, each of which has advantages and disadvantages:

Master/Slave replication allows one master server to accept read/write requests, while multiple slave servers can only accept read/SELECT query requests, currently, the most popular and free Master/Slave PostgreSQL replication solution is slony-I.

The replication mode of multiple master servers allows sending read/write requests to multiple computers. This mode may cause serious performance loss due to the need to synchronize data changes between multiple servers, pgcluster is currently the best in this solution and can be downloaded for free.

There are also some commercial pay-as-you-go and hardware-based Data Replication solutions that support the aforementioned replication models.

From: http://www.xxlinux.com/linux/article/development/database/20060906/3988_2.html

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More