PostgreSQL guts: What is "resjunk"?

November 19, 2013

I’m delving into the PostgreSQL parser, query rewriter and query planner at the moment, as part of work on row-level security for the AXLE project. As I’ve noticed that there’s some great documentation on the overall structure and flow but not much on some of the details I thought I’d start posting about some of the more confusing corners.

If you’re not interested in the PostgreSQL source code and its inner workings you can stop reading now.

resjunk

Today’s topic is the term “resjunk”, which refers to resjunk target-list attribute. You’ll see this term throughout the planner and rewriter, usually as assumed knowledge. The name is not exceedingly helpful.

resjunk columns are described in src/backend/executor/execJunk.c, where there’s a moderately detailed comment. It doesn’t really explain the overarching ideas, though.

The concept is that sometimes PostgreSQL needs to keep track of per-tuple information that isn’t part of the query output. It might be a sort key that’s not part of the select list, an intermediate result from a subquery that’s used as a filter then discarded, or it might be an internal column like ctid that isn’t exposed to users.

Plan nodes have target lists – these are lists of the columns output by that plan node. For a simple SELECT a, b FROM test the columns a and b will appear in the target-list of the index or seqscan plan node for the query. You can observe this yourself by enabling plan logging, per the following trimmed output:


regress=> CREATE TABLE 
regress=> SET enable_print_plan = on;
regress=> SET client_min_messages = debug;
regress=> SELECT a, b FROM test;
LOG:  plan:
DETAIL:     {PLANNEDSTMT 
   :commandType 1 
   :queryId 0 
   ....
   :planTree 
      {SEQSCAN 
      :startup_cost 0.00 
      :total_cost 29.40 
      :plan_rows 1940 
      :plan_width 12 
      :targetlist (
         {TARGETENTRY 
         :expr 
            {VAR 
            :varno 1 
            :varattno 1 
            ...
            :location 7
            }
         ...
         :resjunk false
         }
         {TARGETENTRY 
         :expr 
            {VAR 
            :varno 1 
            :varattno 2 
            ...
            :location 10
            }
         ....
         :resjunk false
         }
      )
      :qual  
      :lefttree  
      :righttree  
      :initPlan  
      :extParam (b)
      :allParam (b)
      :scanrelid 1
      }
   :rtable (
      {RTE 
      :alias  
      :eref 
         {ALIAS 
         :aliasname test 
         :colnames ("a" "b")
         }
      ...
      :selectedCols (b 9 10)
      :modifiedCols (b)
      }
   )
   ....
   }

That’s the detailed plan for:


                       QUERY PLAN                       
--------------------------------------------------------
 Seq Scan on test  (cost=0.00..29.40 rows=1940 width=8)

In it you will see that the SELECT has two entries in the target-list, one for each column. Neither is resjunk since both are output by the query.

What if we add a sort by column c, which isn’t in the SELECT-list, we’ll see a new column added to the target list and marked as resjunk:


regress=> SELECT a, b FROM test ORDER BY c;
LOG:  plan:
DETAIL:     {PLANNEDSTMT 
   :commandType 1 
   ....
   :planTree 
      {SORT 
      ....
      :targetlist (
         {TARGETENTRY 
         :expr 
            {VAR 
            :varno 65001 
            :varattno 1
            ...
            }
         :resno 1 
         :resname a 
         ...
         :resjunk false
         }
         {TARGETENTRY 
         :expr 
            {VAR 
            :varno 65001 
            :varattno 2 
            ...
            }
         :resno 2 
         :resname b 
         ....
         :resjunk false
         }
         {TARGETENTRY 
         :expr 
            {VAR 
            :varno 65001 
            :varattno 3 
            ...
            }
         :resno 3 
         :resname  
         ....
         :resjunk true
         }
      )
      :qual  
      :lefttree 
         {SEQSCAN 
         ...
         :targetlist (
            {TARGETENTRY 
            :expr 
               {VAR 
               :varno 1 
               :varattno 1 
               ...
               }
            :resno 1 
            ...
            :resjunk false
            }
            {TARGETENTRY 
            :expr 
               {VAR 
               :varno 1 
               :varattno 2 
               ...
               }
            :resno 2 
            ...
            :resjunk false
            }
            {TARGETENTRY 
            :expr 
               {VAR 
               :varno 1 
               :varattno 3 
               ...
               }
            :resno 3
            ...
            :resjunk true
            }
         )
         ....
      }
   :rtable (
      {RTE 
      :alias  
      :eref 
         {ALIAS 
         :aliasname test 
         :colnames ("a" "b" "c")
         }
      ....
      :selectedCols (b 9 10 11)
      :modifiedCols (b)
      }
   )
   ....
   }

for the query plan:


                          QUERY PLAN                           
---------------------------------------------------------------
 Sort  (cost=135.34..140.19 rows=1940 width=12)
   Sort Key: c
   ->  Seq Scan on test  (cost=0.00..29.40 rows=1940 width=12)
(3 rows)

So c is marked resjunk because it’s a sort key that isn’t part of the final plan output.

You’ll also see ctid marked resjunk in UPDATE and DELETE plans for similar reasons – the read part of the plan fetches the rows to modify and their tuple IDs; these are pulled into an outer-most MODIFYTABLE plan node that updates the row to mark it deleted and, if it’s an update, inserts a new version of the row.


The research leading to these results has received funding from the European Union’s Seventh Framework Programme (FP7/2007-2013) under grant agreement n° 318633

Share this

More Blogs