In this post, I will go over some of the useful operations that can be performed over sqlite files. I always keep forgetting about this, so I decided to write a blog post.
Assumptions: All the below examples assume that we have a sqlite file named social_db.sqlite3
. If you don’t have sqlite3
on your machine, you can get the latest precompiled binary from here. All the examples can be executed interactively as well.
Examining Schemas
Listing all table schemas: .schema
(ins)-> sqlite3 social_db.sqlite3 ".schema"
CREATE TABLE author(
-- maps to row_id https://www.sqlite.org/lang_createtable.html#rowid
id INTEGER NOT NULL PRIMARY KEY,
name TEXT NOT NULL
);
CREATE TABLE post(
id INTEGER NOT NULL PRIMARY KEY,
content TEXT NOT NULL,
author_id INTEGER NOT NULL,
-- There is no real datetime in sqlite3. It maps to an underlying storage type
created DATETIME NOT NULL,
updated DATETIME NOT NULL,
FOREIGN KEY(author_id) REFERENCES author(id)
);
CREATE TABLE follower(
id INTEGER NOT NULL PRIMARY KEY,
follower_id INTEGER NOT NULL,
followee_id INTEGER NOT NULL,
FOREIGN KEY(followee_id) REFERENCES author(id)
FOREIGN KEY(follower_id) REFERENCES author(id)
CONSTRAINT "follower and followee can't be same" CHECK (follower_id != followee_id)
);
Examine schema for a particular table: .schema <table>
(ins)-> sqlite3 social_db.sqlite3 '.schema author'
CREATE TABLE author(
-- maps to row_id https://www.sqlite.org/lang_createtable.html#rowid
id INTEGER NOT NULL PRIMARY KEY,
name TEXT NOT NULL
);
List all available tables: .tables
(ins)-> sqlite3 social_db.sqlite3 '.tables'
author follower post
Exporting Data
Exporting as CSV Format: -header -csv
options
(ins)-> sqlite3 social_db.sqlite3 'select id, name from author' -header -csv
id,name
1,-3574307474151495397
2,1567793430626433172
3,-1845670907669369539
If you don’t want a cvs header, then remove the
-header
option.
One could change the column separator using the -separator
option.
(ins)-> sqlite3 social_db.sqlite3 'select id, name from author' -header -csv -separator '|'
id|name
1|-3574307474151495397
2|1567793430626433172
3|-1845670907669369539
Exporting as JSON format: -json
option
(ins)-> sqlite3 social_db.sqlite3 'select * from author' -json
[{"id":1,"name":"-3574307474151495397"},
{"id":2,"name":"1567793430626433172"},
{"id":3,"name":"-1845670907669369539"}]
One other format that I find useful is markdown format. This can be achieved by using the -markdown
option.
(ins)-> sqlite3 social_db.sqlite3 'select * from author' -markdown
| id | name |
|----|----------------------|
| 1 | -3574307474151495397 |
| 2 | 1567793430626433172 |
| 3 | -1845670907669369539 |
For all available formats, refer official documentation
Dump Entire Database Content as SQL
Command: .dump
Examples:
# dumps whole db
(ins)-> sqlite3 social_db.sqlite3 '.dump'
PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
CREATE TABLE author(
-- maps to row_id https://www.sqlite.org/lang_createtable.html#rowid
id INTEGER NOT NULL PRIMARY KEY,
name TEXT NOT NULL
);
INSERT INTO author VALUES(1,'hello1');
INSERT INTO author VALUES(2,'1567793430626433172');
INSERT INTO author VALUES(3,'-1845670907669369539');
INSERT INTO author VALUES(4,'input1');
INSERT INTO author VALUES(5,'input2');
INSERT INTO author VALUES(6,'input3');
CREATE TABLE post(
id INTEGER NOT NULL PRIMARY KEY,
content TEXT NOT NULL,
author_id INTEGER NOT NULL,
-- There is no real datetime in sqlite3. It maps to an underlying storage type
created DATETIME NOT NULL,
updated DATETIME NOT NULL,
FOREIGN KEY(author_id) REFERENCES author(id)
);
INSERT INTO post VALUES(1,'',1,'2023-01-28 18:59:47','2023-01-28 18:59:47');
CREATE TABLE follower(
id INTEGER NOT NULL PRIMARY KEY,
follower_id INTEGER NOT NULL,
followee_id INTEGER NOT NULL,
FOREIGN KEY(followee_id) REFERENCES author(id)
FOREIGN KEY(follower_id) REFERENCES author(id)
CONSTRAINT "follower and followee can't be same" CHECK (follower_id != followee_id)
);
INSERT INTO follower VALUES(1,1,2);
CREATE INDEX author_id_index on post(author_id);
COMMIT;
Be careful as this command can generate huge data and is generally not suitable for large datasets.
One could also select a subset of tables and add filter --data-only
flag to include only data.
# dumps just post table and only includes data
(ins)-> sqlite3 social_db.sqlite3 '.dump post --data-only'
INSERT INTO post VALUES(1,'',1,'2023-01-28 18:59:47','2023-01-28 18:59:47');
Importing Data
Consider loading a file containing a list of users into author
table. For this example, we will make following assumptions:
- The file format is
csv
. At the time of writing,sqlite
can only handle importing a csv like file. If we have any other format likejson
, then it needs to be converted intocsv
. - The csv file doesn’t have
id
column (primary key inauthor
table). This is likely what we want in most cases assqlite
will generate its own primary keys instead of reusing keys from alternate sources. This avoids potential conflicts with existing primary keys1.
SQL to import a file authors_dump.csv
-- Added comments for readability
-- "-v" adds verbosity
-- Format: .import <file_name> <table_name>
.import --csv authors_dump.csv temp_author -v
-- We are loading into temp_author as sqlite currently can't generate pks this way.
insert into author(name) select name from temp_author;
-- delete the temp table
drop table temp_author;
Place the above code in import_csv.sql
file and run via sqlite3
command.
(ins)-> cat import_csv.sql | sqlite3 social_db.sqlite3
CREATE TABLE "temp_author"(
"name" TEXT)
Added 3 rows with 0 errors using 4 lines of input
See how the schema for the temp table
temp_author
is created and logged
Query Planner
Command: EXPLAIN QUERY PLAN <query>
Ex: Consider retrieving the number of posts made a particular author in above schema via the following sql:
select count(*) from post where author_id=1;
Running this query on a large dataset is costly as currently there are no indices. We can see this in action by examining query planner results.
-- Before Index Creation: Indicates that whole post table is scanned
sqlite> EXPLAIN QUERY PLAN select count(*) from post where author_id=1;
QUERY PLAN
`--SCAN post
Lets add an index on author_id
column for post
table and see the query planner results:
-- Add the index
sqlite> create index author_id_index on post(author_id);
-- Run the same query again. See the difference in output.
sqlite> EXPLAIN QUERY PLAN select count(*) from post where author_id=1;
QUERY PLAN
`--SEARCH post USING COVERING INDEX author_id_index (author_id=?)
Getting more details
Command: .eqn full
Example:
sqlite> .eqp full
sqlite> select count(*) from post where author_id=1;
QUERY PLAN
`--SEARCH post USING COVERING INDEX author_id_index (author_id=?)
addr opcode p1 p2 p3 p4 p5 comment
---- ------------- ---- ---- ---- ------------- -- -------------
0 Init 0 12 0 0 Start at 12
1 Null 0 1 1 0 r[1..1]=NULL
2 OpenRead 1 5 0 k(2,,) 2 root=5 iDb=0; author_id_index
3 Integer 1 2 0 0 r[2]=1
4 SeekGE 1 8 2 1 0 key=r[2]
5 IdxGT 1 8 2 1 0 key=r[2]
6 AggStep 0 0 1 count(0) 0 accum=r[1] step(r[0])
7 Next 1 5 1 0
8 AggFinal 1 0 0 count(0) 0 accum=r[1] N=0
9 Copy 1 3 0 0 r[3]=r[1]
10 ResultRow 3 1 0 0 output=r[3]
11 Halt 0 0 0 0
12 Transaction 0 0 28 0 1 usesStmtJournal=0
13 Goto 0 1 0 0
Checking Rows with Foreign Key Violations
Command: PRAGMA foreign_key_check;
2
If you are wondering why we need to check rows which violate foreign key constraints? Isn’t this the job of sqlite
. At the time of writing, by default, sqlite
doesn’t enable foreign key constraints3. Caller is responsible for enabling this option every time at runtime after opening a new connection.
One can check the current status of foreign key constraints enforcement using PRAGMA foreign_keys;
command. The example below illustrates that enforcement doesn’t persist across connections and is disabled in a new connection.
(ins)-> sqlite3 social_db.sqlite3
SQLite version 3.40.1 2022-12-28 14:03:47
Enter ".help" for usage hints.
sqlite> PRAGMA foreign_keys;
0
--- value is false by default
-- Enable the enforcement
sqlite> PRAGMA foreign_keys = on;
sqlite> PRAGMA foreign_keys;
1
sqlite> .quit
-- New connection still has default value.
(ins)-> sqlite3 social_db.sqlite3
SQLite version 3.40.1 2022-12-28 14:03:47
Enter ".help" for usage hints.
sqlite> PRAGMA foreign_keys;
0
Always a good practice to enable this setting in every new connection.
If enforcement is disabled, then we can add rows which violate foreign key constraints. Consider an example of adding invalid follower_id
and followee_id
in follower
table:
-- enforcment is off
sqlite> PRAGMA foreign_keys;
0
-- max author id value present.
sqlite> select max(id) from author;
14
-- 100 and 200 ids for author are invalid. But row is successfully added
sqlite> insert into follower(followee_id, follower_id) VALUES(100, 200);
sqlite> select * from follower where followee_id=100 and follower_id=200;
16|200|100
Now, let’s enable foreign key constraints using PRAGMA foreign_keys=on;
command and attempt to add a invalid row. This time insertion fails as expected.
-- Enables foreign key checking
sqlite> PRAGMA foreign_keys=on;
sqlite> insert into follower(followee_id, follower_id) VALUES(100, 300);
Runtime error: FOREIGN KEY constraint failed (19)
However, previous invalid rows added still exist in the table. One could list all foreign key violations for all tables by running PRAGMA foreign_key_check;
command. This information can be used to fix the rows.
-- Enabled headers so that output columns show up
sqlite> .header on
sqlite> PRAGMA foreign_key_check;
table|rowid|parent|fkid
follower|1|author|0
follower|1|author|1
Format of the row:
<table_name>|<row_id>|<foreign_key_table>|<index_of_foreign_key_constraint>
Enabling Foreign key constraints automatically when sqlite
is invoked
If you are like me, enabling PRAGMA foreign_keys=on;
everytime you invoke sqlite3
from the command line is annoying and you will forget this most of the time. We can instead the initialization file .sqliterc
(placed in home directory) that is read by sqlite3
to configure the settings.
PRAGMA foreign_keys=on;
.print "status of foreign key constraints:"
PRAGMA foreign_keys;
{: file="~/.sqliterc" }
We can run sqlite3
to verify that the new configuration works as expected.
(ins)-> sqlite3 social_db.sqlite3
-- Loading resources from /Users/demo/.sqliterc
status of foreign key constraints:
1
SQLite version 3.40.1 2022-12-28 14:03:47
Enter ".help" for usage hints.
-- Insertion of invalid row failed as expected.
sqlite> insert into follower(followee_id, follower_id) VALUES(100, 300);
Runtime error: FOREIGN KEY constraint failed (19)
-
If you want to replicate an external source completely, then you can keep the
id
column. Also, in both cases, you need to handle duplicates properly. ↩︎ -
One could check for a particular table or a scheme as well. For more information, refer to official docs. ↩︎
-
I believe this is due to legacy and backwards-compatibility reasons. It could very well be possible that this may be changed in the future. ↩︎