pg_dump and pg_restore are Slow? Try this.

A friend was trying to import about 300K rows that were dumped with pg_dump. Trouble was, it would barf saying “Invalid Command \N” and never work. The attempted work-around involved telling pg_dump to use discrete INSERT statements per row, but this made it so slow it would have taken nearly 3 days, we guessed.

The actual error was above the Invalid Command messages, but scrolled away. Was able to see it like this:

[code]
[[email protected]]$ psql < import_me.sql 2>&1 | head -n 20
[/code]

Not sure if I needed the STDERR redirect there, but it seemed sensible and worked so I didn’t investigate further!

We ended up fixing the actual problem causing the Invalid Command messages, which involved pre-creating various bits of schema, relations, roles, etc., and using a data-only dump during The Big Import. The error was kicking the COPY FROM STDIN command to malfunction, and it tried to interpret the rows as SQL statements, which they are not.

Moral of the story, use COPY FROM and not INSERTs. There are plenty of other optimizations, turn off indexes and add them back at the end, and make sure your pg tuning is in order.

 

Aaron K.

Aaron Kondziela is a technologist and serial entrepreneur living in New York City.

 

Leave a Reply

Your email address will not be published. Required fields are marked *