Composite types v15

NameNativeAliasDescription
CREATE TYPE <type_name> AS (col1 data type, col2 datatype,..)Structure of a row or record

Overview

A composite type represents the structure of a row or record. It's essentially a list of field names and their data types. PostgreSQL allows you to use composite types in many of the same ways that you can use simple types. For example, you can declare a column of a table as a composite type.

Declaration of composite types

These two simple examples define composite types:

CREATE TYPE complex AS (
    r       double precision,
    i       double precision
);

CREATE TYPE inventory_item AS (
    name            text,
    supplier_id     integer,
    price           numeric
);

The syntax is comparable to CREATE TABLE, except that you can specify only field names and types. Currently, you can't include constraints (such as NOT NULL). The AS keyword is essential. Without it, the system might interpret it as a different kind of CREATE TYPE command, and you can get odd syntax errors.

Having defined the types, you can use them to create tables:

CREATE TABLE on_hand (
    item      inventory_item,
    count     integer
);

INSERT INTO on_hand VALUES (ROW('fuzzy dice', 42, 1.99), 1000);

You can also use them to create functions:

CREATE FUNCTION price_extension(inventory_item, integer) RETURNS numeric
AS 'SELECT $1.price * $2' LANGUAGE SQL;

SELECT price_extension(item, 10) FROM on_hand;

Creating a table also creates a composite type to represent the table's row type. The composite type has the same name as the table. For example, suppose you entered:

CREATE TABLE inventory_item (
    name            text,
    supplier_id     integer REFERENCES suppliers,
    price           numeric CHECK (price > 0)
);

In this cae, the same inventory_item composite type shown in the previous example becomes a byproduct, and you can use it as in the previous example. However, there's an important restriction of the current implementation. Since no constraints are associated with a composite type, the constraints shown in the table definition don't apply to values of the composite type outside the table. (A partial workaround is to use domain types as members of composite types.)

Constructing composite values

To write a composite value as a literal constant, enclose the field values in parentheses and separate them with commas. You can put double quotes around any field value and must do so if it contains commas or parentheses. Therefore, the general format of a composite constant is:

'( val1 , val2 , ... )'

An example is:

'("fuzzy dice",42,1.99)'

This is a valid value of the inventory_item type defined earlier. To make a field NULL, enter no characters in its position in the list. For example, this constant specifies a NULL third field:

'("fuzzy dice",42,)'

If you want an empty string rather than NULL, use double quotes:

'("",42,)'

Here the first field is a non-NULL empty string. The third is NULL.

Note

These constants are actually only a special case of the generic type constants discussed in Constants of Other Types. The constant is initially treated as a string and passed to the composite-type input conversion routine. An explicit type specification might be necessary to tell which type to convert the constant to.

You can also use the ROW expression syntax to construct composite values. In most cases, this is considerably simpler to use than the string-literal syntax since you don't have to use multiple layers of quoting. The earlier example already used this method:

ROW('fuzzy dice', 42, 1.99)
ROW('', 42, NULL)

The ROW keyword is optional as long as you have more than one field in the expression. So these can be simplified to:

('fuzzy dice', 42, 1.99)
('', 42, NULL)

Accessing composite types

To access a field of a composite column, write a dot and the field name, much like selecting a field from a table name. In fact, it's so much like selecting from a table name that you often have to use parentheses to keep from confusing the parser. For example, you might try to select some subfields from the on_hand example table with something like:

SELECT item.name FROM on_hand WHERE item.price > 9.99;

However, that doesn't work since the name item is taken to be a table name, not a column name of on_hand, per SQL syntax rules. You must write it like this:

SELECT (item).name FROM on_hand WHERE (item).price > 9.99;

If you need to use the table name as well (for instance, in a multitable query), write it like this:

SELECT (on_hand.item).name FROM on_hand WHERE (on_hand.item).price > 9.99;

Now the parenthesized object is correctly interpreted as a reference to the item column, and then the subfield can be selected from it.

Similar syntactic issues apply whenever you select a field from a composite value. For instance, to select just one field from the result of a function that returns a composite value, you need to write something like:

SELECT (my_func(...)).field FROM ...

Without the extra parentheses, this command generates a syntax error.

Modifying composite types

These examples show the proper syntax for inserting and updating composite columns. First, insert or update a whole column:

INSERT INTO mytab (complex_col) VALUES((1.1,2.2));

UPDATE mytab SET complex_col = ROW(1.1,2.2) WHERE ...;

The first example omits ROW, and the second uses it. You can do it either way.

You can update an individual subfield of a composite column:

UPDATE mytab SET complex_col.r = (complex_col).r + 1 WHERE ...;

Notice that you don't need to (and can't) put parentheses around the column name appearing just after SET. But you do need parentheses when referencing the same column in the expression to the right of the equal sign.

You can specify subfields as targets for INSERT, too:

INSERT INTO mytab (complex_col.r, complex_col.i) VALUES(1.1, 2.2);

If you don't supply values for all the subfields of the column, the remaining subfields are filled with null values.

Using composite types in queries

Various special syntax rules and behaviors are associated with composite types in queries. These rules provide useful shortcuts but can be confusing if you don't know the logic behind them.

In PostgreSQL, a reference to a table name (or alias) in a query is effectively a reference to the composite value of the table's current row. For example, if you have a table inventory_item as shown in Declaration of composite types, you can write:

SELECT c FROM inventory_item c;

This query produces a single composite-valued column, so you might get output like:

Output
           c
------------------------
 ("fuzzy dice",42,1.99)
(1 row)

Simple names are matched to column names before table names, so this example works only because there isn't a column named c in the query's tables.

The ordinary qualified-column-name syntax table_name.column_name can be understood as applying field selection to the composite value of the table's current row. For efficiency reasons, it's not actually implemented that way.

Suppose you write:

SELECT c.* FROM inventory_item c;

Then, according to the SQL standard, you get the contents of the table expanded into separate columns:

Output
    name    | supplier_id | price
------------+-------------+-------
 fuzzy dice |          42 |  1.99
(1 row)

This behavior is the same as if the query were:

SELECT c.name, c.supplier_id, c.price FROM inventory_item c;

PostgreSQL applies this expansion behavior to any composite-valued expression, although as shown in Accessing composite types, you need to write parentheses around the value that .* is applied to whenever it's not a simple table name. For example, if myfunc() is a function returning a composite type with columns a, b, and c, then these two queries have the same result:

SELECT (myfunc(x)).* FROM some_table;
SELECT (myfunc(x)).a, (myfunc(x)).b, (myfunc(x)).c FROM some_table;
Tip

PostgreSQL handles column expansion by transforming the first form into the second. So, in this example, myfunc() gets invoked three times per row with either syntax. If it's an expensive function, you might want to avoid that, which you can do with a query like:

SELECT (m).* FROM (SELECT myfunc(x) AS m FROM some_table OFFSET 0) ss;

The OFFSET 0 clause keeps the optimizer from "flattening" the sub-select to arrive at the form with multiple calls of myfunc().

The composite_value.* syntax results in column expansion of this kind when it appears at the top level of a SELECT output list, a RETURNING list in INSERT/UPDATE/DELETE, a VALUES clause, or a row constructor. In all other contexts (including when nested inside one of those constructs), attaching .* to a composite value doesn't change the value, since it means "all columns," and so the same composite value is produced again. For example, if somefunc() accepts a composite-valued argument, these queries are the same:

SELECT somefunc(c.*) FROM inventory_item c;
SELECT somefunc(c) FROM inventory_item c;

In both cases, the current row of inventory_item is passed to the function as a single composite-valued argument. Even though .* does nothing in such cases, using it is good style, since it makes clear that a composite value is intended. In particular, the parser considers c in c.* to refer to a table name or alias, not to a column name, so that there's no ambiguity. Without .*, it isn't clear whether c means a table name or a column name, and in fact the column-name interpretation is preferred if there's a column named c.

Another example demonstrating these concepts is that all these queries mean the same thing:

SELECT * FROM inventory_item c ORDER BY c;
SELECT * FROM inventory_item c ORDER BY c.*;
SELECT * FROM inventory_item c ORDER BY ROW(c.*);

All of these ORDER BY clauses specify the row's composite value. However, if inventory_item contains a column named c, the first case is different from the others, as it means to sort by that column only. Given the column names previously shown, these queries are also equivalent to them:

SELECT * FROM inventory_item c ORDER BY ROW(c.name, c.supplier_id, c.price);
SELECT * FROM inventory_item c ORDER BY (c.name, c.supplier_id, c.price);

Another special syntactical behavior associated with composite values is that you can use functional notation for extracting a field of a composite value. The simple way to explain this is that the notations field(table) and table.field are interchangeable. For example, these queries are equivalent:

SELECT c.name FROM inventory_item c WHERE c.price > 1000;
SELECT name(c) FROM inventory_item c WHERE price(c) > 1000;

Moreover, if you have a function that accepts a single argument of a composite type, you can call it with either notation. These queries are all equivalent:

SELECT somefunc(c) FROM inventory_item c;
SELECT somefunc(c.*) FROM inventory_item c;
SELECT c.somefunc FROM inventory_item c;

This equivalence between functional notation and field notation makes it possible to use functions on composite types to implement computed fields. An application using this last query doesn't need to be directly aware that somefunc isn't a real column of the table.

Tip

Because of this behavior, we don't recommend giving a function that takes a single composite-type argument the same name as any of the fields of that composite type. If there's ambiguity, the field-name interpretation is preferred, so that such a function can't be called without tricks. One way to force the function interpretation is to schema-qualify the function name, that is, write schema.func(compositevalue).

Composite type input and output syntax

The external text representation of a composite value consists of items that are interpreted according to the I/O conversion rules for the individual field types, plus decoration that indicates the composite structure. The decoration consists of parentheses, that is, ( and ), around the whole value, plus commas (,) between adjacent items. Whitespace outside the parentheses is ignored. However, within the parentheses it's considered part of the field value and might be significant depending on the input conversion rules for the field data type. For example:

'(  42)'

The whitespace is ignored if the field type is an integer but not if it's text.

As shown previously, when writing a composite value, you can use double quotes around any individual field value. You must do so if the field value would otherwise confuse the composite-value parser. In particular, fields containing parentheses, commas, double quotes, or backslashes must be double-quoted. To put a double quote or backslash in a quoted composite field value, precede it with a backslash. Also, a pair of double quotes within a double-quoted field value is taken to represent a double-quote character, analogous to the rules for single quotes in SQL literal strings. Alternatively, you can avoid quoting and use backslash-escaping to protect all data characters that would otherwise be taken as composite syntax.

A completely empty field value (no characters at all between the commas or parentheses) represents a NULL. To write a value that's an empty string rather than NULL, write "".

The composite output routine puts double quotes around field values if they're empty strings or contain parentheses, commas, double quotes, backslashes, or white space. Doing so for white space isn't essential but aids legibility. Double quotes and backslashes embedded in field values are doubled.

Note

What you write in a SQL command is first interpreted as a string literal and then as a composite. This behavior doubles the number of backslashes you need, assuming you use escape-string syntax. For example, to insert a text field containing a double quote and a backslash in a composite value, enter:

INSERT ... VALUES ('("\"\\")');

The string-literal processor removes one level of backslashes, so that what arrives at the composite-value parser looks like ("\"\\"). In turn, the string fed to the text data type's input routine becomes "\. If you're working with a data type whose input routine also treated backslashes specially, bytea for example, you might need as many as eight backslashes in the command to get one backslash into the stored composite field. You can use dollar quoting to avoid the need to double backslashes.

Tip

The ROW constructor syntax is usually easier to work with than the composite-literal syntax when writing composite values in SQL commands. In ROW, write individual field values the same way as when they aren't members of a composite.