During my development phase, I encountered an interesting discussion point regarding database table writing logic.
The core database table design I'm working with differs significantly from its visual representation. While view tables appear similar to standard normalized SQL tables, they internally store data using a primary jsonb
column xdo
. This approach, though innovative, presents some noteworthy pros and cons.
One significant consideration is the storage of null values. Visually, null values seem like allocated fields, typical in normalized columns, where storing a null usually consumes just 1 byte. However, when using a single jsonb
column (xdo
) to store data, each null value inherently requires additional storage space due to the necessity of storing the associated key.
For example, if each row averages three null fields with approximately 5-character keys in a table containing 10 million records, the storage overhead per row is about 54 bytes, equating to approximately 540MB for the entire table. With longer keys (e.g., 10 characters), this overhead nearly doubles to around 1GB per table.
When scaling up to tables of 20-30 million records, this overhead quickly escalates to around 2-3GB solely dedicated to storing null values. This not only impacts storage expenses but also increases the size and overhead of associated GIN indexes, potentially affecting query performance.
However, this drawback can be effectively turned into an advantage by proactively stripping out null values before writing them to the database. This strategy completely eliminates the unnecessary keys from the jsonb
column, effectively creating "virtual nulls" in the view tables.
Implementing a simple feature—such as a checkbox or boolean option within Add Record, Add/Edit Record, and Edit Record functionalities—would allow users to enable or disable the native Postgres strip nulls function, easily addressing this issue and optimizing database storage efficiency.
I would love to hear your thoughts on this approach or any experiences you've had dealing with similar storage optimizations.