postgresql/src/tools/pgindent
Andres Freund 5dfc198146 Use more efficient hashtable for execGrouping.c to speed up hash aggregation.
The more efficient hashtable speeds up hash-aggregations with more than
a few hundred groups significantly. Improvements of over 120% have been
measured.

Due to the the different hash table queries that not fully
determined (e.g. GROUP BY without ORDER BY) may change their result
order.

The conversion is largely straight-forward, except that, due to the
static element types of simplehash.h type hashes, the additional data
some users store in elements (e.g. the per-group working data for hash
aggregaters) is now stored in TupleHashEntryData->additional.  The
meaning of BuildTupleHashTable's entrysize (renamed to additionalsize)
has been changed to only be about the additionally stored size.  That
size is only used for the initial sizing of the hash-table.

Reviewed-By: Tomas Vondra
Discussion: <20160727004333.r3e2k2y6fvk2ntup@alap3.anarazel.de>
2016-10-14 17:22:51 -07:00
..
exclude_file_patterns Simplify the process of perltidy'ing our Perl files. 2016-08-15 11:32:09 -04:00
indent.bsd.patch Fix pg_bsd_indent bug where newlines were not being trimmed from typedef 2011-10-26 17:24:19 -04:00
perltidyrc Remove whitespace from end of lines 2013-05-30 21:05:07 -04:00
pgcppindent Remove cvs keywords from all files. 2010-09-20 22:08:53 +02:00
pgindent Fix more typos in comments. 2015-05-20 19:45:43 +03:00
pgindent.man Fix whitespace issues found by git diff --check, add gitattributes 2013-11-10 14:48:29 -05:00
pgperltidy Simplify the process of perltidy'ing our Perl files. 2016-08-15 11:32:09 -04:00
README Simplify the process of perltidy'ing our Perl files. 2016-08-15 11:32:09 -04:00
typedefs.list Use more efficient hashtable for execGrouping.c to speed up hash aggregation. 2016-10-14 17:22:51 -07:00

pgindent'ing the PostgreSQL source tree
=======================================

We run this process at least once in each development cycle,
to maintain uniform layout style in our C and Perl code.

You might find this blog post interesting:
http://adpgtech.blogspot.com/2015/05/running-pgindent-on-non-core-code-or.html


PREREQUISITES:

1) Install pg_bsd_indent in your PATH (see below for details).

2) Install entab (src/tools/entab/).

3) Install perltidy.  Please be sure it is v20090616 (older and newer
   versions make different formatting choices, and we want consistency).

DOING THE INDENT RUN:

1) Change directory to the top of the source tree.

2) Remove all derived files (pgindent has trouble with flex files, and it
   would be pointless to run it on them anyway):

	make maintainer-clean
   Or:
	git clean -fdx

3) Download the latest typedef file from the buildfarm:

	wget -O src/tools/pgindent/typedefs.list http://buildfarm.postgresql.org/cgi-bin/typedefs.pl

   (See http://www.pgbuildfarm.org/cgi-bin/typedefs.pl?show_list for a full
   list of typedef files, if you want to indent some back branch.)

4) Run pgindent on the C files:

	src/tools/pgindent/pgindent

   If any files generate errors, restore their original versions with
   "git checkout", and see below for cleanup ideas.

5) Indent the Perl code using perltidy:

	src/tools/pgindent/pgperltidy

   If you want to use some perltidy version that's not in your PATH,
   first set the PERLTIDY environment variable to point to it.

VALIDATION:

1) Check for any newly-created files using "git status"; there shouldn't
   be any.  (perltidy tends to leave *.LOG files behind if it has trouble.)

2) Do a full test build:

	./configure ...
	make -s all	# look for unexpected warnings, and errors of course
	make check-world

   Your configure switches should include at least --enable-tap-tests
   or else much of the Perl code won't get exercised.

3) If you have the patience, it's worth eyeballing the "git diff" output
   for any egregiously ugly changes.  See below for cleanup ideas.


When you're done, "git commit" everything including the typedefs.list file
you used.


---------------------------------------------------------------------------

Cleaning up in case of failure or ugly output
---------------------------------------------

If you don't like the results for any particular file, "git checkout"
that file to undo the changes, patch the file as needed, then repeat
the indent process.

pgindent will reflow any comment block that's not at the left margin.
If this messes up manual formatting that ought to be preserved, protect
the comment block with some dashes:

	/*----------
	 * Text here will not be touched by pgindent.
	 *----------
	 */

Odd spacing around typedef names might indicate an incomplete typedefs list.

pgindent can get confused by #if sequences that look correct to the compiler
but have mismatched braces/parentheses when considered as a whole.  Usually
that looks pretty unreadable to humans too, so best practice is to rearrange
the #if tests to avoid it.

Sometimes, if pgindent or perltidy produces odd-looking output, it's because
of minor bugs like extra commas.  Don't hesitate to clean that up while
you're at it.

---------------------------------------------------------------------------

BSD indent
----------

We have standardized on NetBSD's indent, and renamed it pg_bsd_indent.
We have fixed a few bugs which requre the NetBSD source to be patched
with indent.bsd.patch patch.  A fully patched version is available at
ftp://ftp.postgresql.org/pub/dev.

GNU indent, version 2.2.6, has several problems, and is not recommended.
These bugs become pretty major when you are doing >500k lines of code.
If you don't believe me, take a directory and make a copy.  Run pgindent
on the copy using GNU indent, and do a diff -r. You will see what I
mean. GNU indent does some things better, but mangles too.  For details,
see:

	http://archives.postgresql.org/pgsql-hackers/2003-10/msg00374.php
	http://archives.postgresql.org/pgsql-hackers/2011-04/msg01436.php

---------------------------------------------------------------------------

Which files are processed
-------------------------

The pgindent run processes (nearly) all PostgreSQL *.c and *.h files,
but we currently exclude *.y and *.l files.  Exceptions are listed
in exclude_file_patterns:

src/include/storage/s_lock.h and src/include/port/atomics/ are excluded
because they contain assembly code that pgindent tends to mess up.

src/interfaces/ecpg/test/expected/ is excluded to avoid breaking the ecpg
regression tests.  Several *.h files are included in regression output so
should not be changed.

src/include/snowball/libstemmer/ and src/backend/snowball/libstemmer/
are excluded because those files are imported from an external project,
not maintained locally, and are machine-generated anyway.  Likewise for
plperl/ppport.h.

The perltidy run processes all *.pl and *.pm files, plus a few
executable Perl scripts that are not named that way.  See the "find"
rules in pgperltidy for details.