PostgreSQL (postgres or pgsql) is a powerful open source relational database known for reliability, extensibility, and standards compliance. It features:
- Advanced SQL support (window functions, CTEs, JSON, full-text search)
- ACID compliance and strong transactional integrity
- Rich indexing (B-tree, GIN, GiST, BRIN, hash, SP-GiST)
- Extensible with custom types, operators, and functions
- MVCC for high concurrency and performance
- Robust security, authentication, and role management
- Active community, frequent releases, and excellent documentation
Ideal for everything from small apps to large-scale, mission-critical systems.
Core Concepts
Configuration
The location depends on the distro, but is generally somewhere like `/etc/postgresql/11/main`. Key files:
- `postgresql.conf`: the main config file. By default the server listens on port 5432, and the data directory lives somewhere like `/var/lib/postgresql/11/main`.
- `pg_hba.conf`: host-based authentication
- `pg_ident.conf`: ident authentication user name mappings
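If the layout on a given distro is unclear, the running server can report its own paths — a quick sketch from any psql session (the example paths shown are illustrative):

```sql
-- Ask the running server where its files live
SHOW config_file;     -- e.g. /etc/postgresql/11/main/postgresql.conf
SHOW hba_file;        -- e.g. /etc/postgresql/11/main/pg_hba.conf
SHOW data_directory;  -- e.g. /var/lib/postgresql/11/main
SHOW port;            -- e.g. 5432
```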
Roles and Security
Postgres unifies users and groups into a single concept: the role (i.e. users are roles, groups are roles).
A number of authentication methods are available; the packaged default for local connections is typically peer (historically ident).
- `ident` and `peer`: a local *nix account with the same name as a PG role can sign in as that role. Optional user-to-role mappings can go in `pg_ident.conf`. `ident` applies to TCP/IP connections, `peer` to local (Unix socket) connections.
- `md5`: takes an MD5-hashed password
- `password`: clear text
- `trust`: no password needed, based on an IP/user/DB match
- `reject`: deny immediately
- `GSSAPI`: standard secure auth protocol defined in RFC 2743, often combined with Kerberos
- `SSPI`: for Microsoft environments; uses Kerberos when possible, will fall back to NTLM
- and lots more: `kerberos`, `LDAP`, `RADIUS`, `Certificate` and `PAM`
By default, only the postgres user has read/write access to the configuration files; to edit them, make sure you are running as this identity.
To allow a non-default login, open up the pg_hba.conf (host based auth) config file.
$ sudo -u postgres vim pg_hba.conf
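For reference, entries in pg_hba.conf take the form type, database, user, address, method. A minimal sketch (the addresses and methods here are illustrative, not a recommendation):

```
# TYPE  DATABASE  USER      ADDRESS         METHOD
local   all       postgres                  peer
local   all       all                       peer
host    all       all       127.0.0.1/32    md5
host    all       all       192.168.1.0/24  md5
```

Entries are matched top to bottom; the first matching line decides the authentication method.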
Basics
Connectivity
Starting off with ident authentication: by default on installation, a new OS user and PG role called postgres is created. Running the psql client as this user is one way to gain initial access, i.e. by sudo -i -u postgres, or by running psql under that user: sudo -u postgres psql.
postgres=# SELECT version();
version
-----------------------------------------------------------------------------------------------------------------------------------
PostgreSQL 11.1 (Ubuntu 11.1-1.pgdg18.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0, 64-bit
(1 row)
Enabling remote access is generally a first desirable goal.
- Edit `postgresql.conf`: uncomment `listen_addresses = 'localhost'` and change `localhost` to `*`
- Edit `pg_hba.conf`: add `host all all 192.168.1.0/24 trust` (substituting the network and CIDR for what's appropriate; `trust` is fine for a lab, but prefer `md5` on any shared network)
- Bounce the daemon: `sudo systemctl restart postgresql`
- From a remote (client) host, verify the port is open with a port scan: `nmap -sT 192.168.1.238`
$ nmap -sT 192.168.1.238
Nmap scan report for 192.168.1.238
PORT STATE SERVICE
22/tcp open ssh
5432/tcp open postgresql
Client Utilities
PostgreSQL ships with several handy client applications. You can see the symlinks to them in /usr/bin by running ls -l /usr/bin | grep postgres.
- `clusterdb`: cluster a database
- `createdb`: create a new database
- `createlang`: install a procedural language
- `createuser`: define a new user account
- `dropdb`: remove a database
- `droplang`: remove a procedural language
- `dropuser`: remove a user account
- `pg_basebackup`: take a base backup of a cluster
- `pgbench`: run a benchmark test
- `pg_dump`: extract a database into a script file or other archive file
- `pg_dumpall`: extract a database cluster into a script file
- `pg_isready`: check the connection status of a server
- `pg_receivewal`: stream write-ahead logs from a server
- `pg_receivexlog`: streams transaction logs from a cluster
- `pg_recvlogical`: control logical decoding streams
- `pg_restore`: restore a database from an archive file created by `pg_dump`
- `psql`: interactive terminal
- `reindexdb`: reindex a database
- `vacuumdb`: garbage-collect and analyze a database
For example, to create a user and a database, you could run:
$ sudo -u postgres createuser --login ben
And:
$ sudo -u postgres createdb --owner=ben foodb
To make the ben role a superuser, the -s switch could have been passed to createuser. To promote the existing low-powered role, leverage the ALTER USER SQL command. First, review existing role permissions with a quick \du:
$ sudo -u postgres psql
postgres=# \du
List of roles
Role name | Attributes | Member of
-----------+------------------------------------------------------------+-----------
ben | | {}
postgres | Superuser, Create role, Create DB, Replication, Bypass RLS | {}
Then the SQL:
postgres=# ALTER USER ben WITH SUPERUSER CREATEDB CREATEROLE REPLICATION LOGIN;
ALTER ROLE
Verification:
postgres=# \du
List of roles
Role name | Attributes | Member of
-----------+------------------------------------------------------------+-----------
ben | Superuser, Create role, Create DB, Replication | {}
postgres | Superuser, Create role, Create DB, Replication, Bypass RLS | {}
Several privileges are available and well documented:
- `SUPERUSER`: can override any access controls within the database. Dangerous.
- `CREATEDB`: can spin up new dbs
- `CREATEROLE`: can create/drop roles (users)
- `INHERIT`: automatically applies powers of roles that the role is a member of (without explicit `SET ROLE`)
- `LOGIN`: can be used for initial session authorisation (i.e. a user)
- `REPLICATION`: needed for connectivity to servers in replication mode
- `BYPASSRLS`: bypass row-level security (RLS) policy
Each of the above has an explicit deny version of the privilege, prefixed with NO, e.g. NOSUPERUSER.
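For example, to wind the ben role's powers back down (a sketch reusing the role from above):

```sql
-- Strip the elevated privileges granted earlier; LOGIN is left intact
ALTER ROLE ben WITH NOSUPERUSER NOCREATEDB NOCREATEROLE NOREPLICATION;
```

Verify with another `\du`; the Attributes column for ben should be empty again.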
psql Primer
- `\?`: show all psql commands
- `\h`: show SQL syntax help
- `\c dbname [username]`: connect to database, with an optional username (or `\connect`)
Display commands (append + for extra details):
- `\l`: list all databases (or `\list`)
- `\d`: display all tables, indexes, views, and sequences
- `\dt`: display all tables
- `\di`: display all indexes
- `\dv`: display all views
- `\ds`: display all sequences
- `\dT`: display all types
- `\dS`: display all system tables
- `\du`: display all users
- `\x [auto|on|off]`: toggle expanded output mode
To change database use \c:
\c enceladus
enceladus=#
To get help for a specific SQL statement, try the handy psql command \h:
enceladus=# \h CREATE TABLE
Command: CREATE TABLE
Description: define a new table
Syntax:
CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXISTS ] table_name ( [
{ column_name data_type [ COLLATE collation ] [ column_constraint [ ... ] ]
| table_constraint
....
Go for it:
enceladus=# drop table if exists master_plan;
enceladus=# create table master_plan(
enceladus(# id serial primary key,
enceladus(# the_date date,
enceladus(# title varchar(100),
enceladus(# description text
enceladus(# );
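With the table in place, `\d master_plan` will describe it, and a quick insert confirms the serial primary key is auto-assigned (the sample values here are made up):

```sql
-- id is populated automatically by the serial sequence
INSERT INTO master_plan (the_date, title, description)
VALUES ('2004-07-01', 'Saturn Orbit Insertion', 'Cassini enters orbit around Saturn');

SELECT * FROM master_plan;
```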
SQL script files
Great for reusable SQL, and putting into Git.
psql foodb -h 10.1.2.3 < build.sql
Or:
psql foodb -h 10.1.2.3 -f build.sql
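A build.sql script is just plain SQL. An idempotent sketch along the lines of the table above:

```sql
-- build.sql: safe to run repeatedly
DROP TABLE IF EXISTS master_plan;

CREATE TABLE master_plan (
  id serial PRIMARY KEY,
  the_date date,
  title varchar(100),
  description text
);
```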
Importing Data (ETL)
Generally ETL software is bloat. KISS with:
- Idempotent shell scripts and makefiles. Idempotency is important so that scripts can be reliably run numerous times and result in the same outcome. All data should initially be typed as `TEXT`; typing and cleanup can be dealt with later on.
- If needed, elevate to Python using Pandas or PETL.
- If needed, elevate further by considering a dedicated piece of middleware such as Kafka or Logstash.
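Applying the all-`TEXT` rule, an idempotent import table might look like this (the `import` schema and column names are illustrative, not the actual master_plan.csv layout):

```sql
-- Keep raw imports in their own schema; everything lands as TEXT first
CREATE SCHEMA IF NOT EXISTS import;
DROP TABLE IF EXISTS import.master_plan;

CREATE TABLE import.master_plan (
  start_time_utc text,
  title text,
  description text
);
```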
Digging through the help (`\h COPY`) shows that COPY FROM will be useful for loading the CSV on disk:
copy data between a file and a table
For example:
COPY master_plan
FROM '<put_path_here>/master_plan.csv'
WITH DELIMITER ',' HEADER CSV;
Note: the SQL-based approach is server-side, i.e. the CSV path will need to be accessible by the server itself.
For a client-side approach, leverage the `\copy` meta-command, like so:
\copy master_plan FROM '/home/ben/code/psql/cassini_data/curious_data/data/master_plan.csv' WITH DELIMITER ',' HEADER CSV;
JSON and JSONB
Key Benefits of JSONB
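The queries below assume an events table along these lines (a hypothetical event-store sketch):

```sql
CREATE TABLE events (
  id bigserial PRIMARY KEY,
  event_data jsonb NOT NULL,
  created_at timestamptz NOT NULL DEFAULT now()
);

INSERT INTO events (event_data)
VALUES ('{"type": "OrderCreated", "user_id": 123, "amount": 42.50}');
```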
-- Fast containment queries - find events containing specific data:
SELECT * FROM events WHERE event_data @> '{"user_id": 123}';
SELECT * FROM events WHERE event_data @> '{"type": "OrderCreated"}';
-- Existence checks - check if JSON key exists
SELECT * FROM events WHERE event_data ? 'correlation_id';
SELECT * FROM events WHERE event_data ?& array['user_id', 'amount'];
-- JSON path queries - query nested JSON structures:
SELECT * FROM events WHERE event_data @@ '$.metadata.version > 2';
Event sourcing use-cases (one of my favourite architectural patterns):
-- Event Sourcing Use Cases - Aggregate filtering:
WHERE event_data @> '{"aggregate_id": "order-123"}'
-- Event type filtering - all payment events:
WHERE event_data @> '{"event_type": "PaymentProcessed"}'
-- Complex event queries - Events with specific metadata:
WHERE event_data @> '{"metadata": {"source": "api"}}'
Indexing
GIN (Generalized Inverted Index)
GIN (Generalized Inverted Index) provides fast querying capabilities for JSONB data in PostgreSQL. GIN essentially makes JSONB queries perform like traditional indexed columns, which is crucial for use-cases such as event stores with millions of fine grained events.
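A sketch of the two common options, assuming an events table with a jsonb event_data column as in the queries above (index names are illustrative):

```sql
-- Default operator class: supports @>, ?, ?& and ?| operators
CREATE INDEX idx_events_data ON events USING GIN (event_data);

-- jsonb_path_ops: smaller and faster, but only supports containment (@>)
-- and jsonpath matching (@@, @?) -- no key-existence (?) support
CREATE INDEX idx_events_data_path ON events USING GIN (event_data jsonb_path_ops);
```

Choose jsonb_path_ops when queries are containment-only; it typically yields a noticeably smaller index.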
Uber handy
Number of active connections
select count(*) from pg_stat_activity where datname = 'svc';
Kill connections
select pg_terminate_backend(pid) from pg_stat_activity where pid <> pg_backend_pid() and datname = 'database_name';