Final doc tweaks before 4.1

Daniel J. Summers 2025-04-19 15:47:42 -04:00
parent f80914daef
commit 918cb384d4
10 changed files with 30 additions and 253 deletions

@ -22,7 +22,10 @@
"content": [
{
"files": [
"**/*.{md,yml}"
"index.md",
"toc.yml",
"api/**/*.{md,yml}",
"docs/**/*.{md,yml}"
],
"exclude": [
"_site/**"
@ -32,7 +35,8 @@
"resource": [
{
"files": [
"bitbadger-doc.png"
"bitbadger-doc.png",
"favicon.ico"
]
}
],
@ -46,6 +50,7 @@
"_appName": "BitBadger.Documents",
"_appTitle": "BitBadger.Documents",
"_appLogoPath": "bitbadger-doc.png",
"_appFaviconPath": "favicon.ico",
"_appFooter": "Hand-crafted documentation created with <a href=https://dotnet.github.io/docfx target=_blank class=external>docfx</a> by <a href=https://bitbadger.solutions target=_blank class=external>Bit Badger Solutions</a>",
"_enableSearch": true,
"pdf": false

@ -7,10 +7,10 @@ While the functions provided by the library cover lots of use cases, there are o
- [Customizing Serialization][ser]
- [Related Documents and Custom Queries][rel]
- [Transactions][txn]
- [Referential Integrity][ref] (PostgreSQL only)
- [Referential Integrity with Documents][ref] (PostgreSQL only; conceptual)
[ser]: ./custom-serialization.md "Advanced Usage: Custom Serialization • BitBadger.Documents"
[rel]: ./related.md "Advanced Usage: Related Documents • BitBadger.Documents"
[txn]: ./transactions.md "Advanced Usage: Transactions • BitBadger.Documents"
[ref]: ./integrity.md "Advanced Usage: Referential Integrity • BitBadger.Documents"
[ref]: /concepts/referential-integrity.html "Appendix: Referential Integrity with Documents &bull; Concepts &bull; Relational Documents"

@ -1,222 +0,0 @@
# Referential Integrity
_<small>Documentation pages for `BitBadger.Npgsql.Documents` redirect here. This library replaced it as of v3; see project home if this applies to you.</small>_
One of the hallmarks of document databases is loose association between documents. In our running hotel and room example, there is no technical reason we could not delete every hotel in the database, leaving all the rooms with hotel IDs that no longer exist. This is a feature-not-a-bug, but it shows the tradeoffs inherent to selecting a data storage mechanism. In our case, this is less than ideal - but, since we are using PostgreSQL, a relational database, we can implement referential integrity if, when, and where we need it.
> _NOTE: This page has very little to do with the document library itself; these are all modifications that can be made via PostgreSQL. SQLite may have similar capabilities, but this author has yet to explore that._
## Enforcing Referential Integrity on the Child Document
While we've been able to use `data->>'Id'` in place of column names for most things up to this point, here is where we hit a roadblock; we cannot define a foreign key constraint against an arbitrary expression. Through database triggers, though, we can accomplish the same thing.
Triggers are implemented in PostgreSQL through a function/trigger definition pair. A function defined as a trigger has `NEW` and `OLD` defined as the data that is being manipulated (different ones, depending on the operation; no `OLD` for `INSERT`s, no `NEW` for `DELETE`s, etc.). For our purposes here, we'll use `NEW`, as we're trying to verify the data as it's being inserted or updated.
```sql
CREATE OR REPLACE FUNCTION room_hotel_id_fk() RETURNS TRIGGER AS $$
DECLARE
    hotel_id TEXT;
BEGIN
    SELECT data->>'Id' INTO hotel_id FROM hotel WHERE data->>'Id' = NEW.data->>'HotelId';
    IF hotel_id IS NULL THEN
        RAISE EXCEPTION 'Hotel ID % does not exist', NEW.data->>'HotelId';
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE OR REPLACE TRIGGER hotel_enforce_fk BEFORE INSERT OR UPDATE ON room
    FOR EACH ROW EXECUTE FUNCTION room_hotel_id_fk();
```
This is as straightforward as we can make it; if the query fails to retrieve data (returning `NULL` here, not raising `NO_DATA_FOUND` like Oracle would), we raise an exception. Here's what that looks like in practice.
```
hotel=# insert into room values ('{"Id": "one", "HotelId": "fifteen"}');
ERROR: Hotel ID fifteen does not exist
CONTEXT: PL/pgSQL function room_hotel_id_fk() line 7 at RAISE
hotel=# insert into hotel values ('{"Id": "fifteen", "Name": "Demo Hotel"}');
INSERT 0 1
hotel=# insert into room values ('{"Id": "one", "HotelId": "fifteen"}');
INSERT 0 1
```
(This assumes we'll always have a `HotelId` field; [see below][] on how to create this trigger if the foreign key is optional.)
## Enforcing Referential Integrity on the Parent Document
We've only addressed half of the parent/child relationship so far; now, we need to make sure parents don't disappear.
### Referencing the Child Key
The trigger on `room` referenced the unique index in its lookup. When we try to go from `hotel` to `room`, though, we'll need to address the `HotelId` field of the `room` document. For the best efficiency, we can index that field.
```sql
CREATE INDEX IF NOT EXISTS idx_room_hotel_id ON room ((data->>'HotelId'));
```
### `ON DELETE NO ACTION`
When defining a foreign key constraint, the final part of that clause is an `ON DELETE` action; if it's excluded, it defaults to `NO ACTION`. The effect of this is that rows cannot be deleted if they are referenced in a child table. This can be implemented by looking for any rows that reference the hotel being deleted, and raising an exception if any are found.
```sql
CREATE OR REPLACE FUNCTION hotel_room_delete_prevent() RETURNS TRIGGER AS $$
DECLARE
    has_rows BOOL;
BEGIN
    SELECT EXISTS(SELECT 1 FROM room WHERE OLD.data->>'Id' = data->>'HotelId') INTO has_rows;
    IF has_rows THEN
        RAISE EXCEPTION 'Hotel ID % has dependent rooms; cannot delete', OLD.data->>'Id';
    END IF;
    RETURN OLD;
END;
$$ LANGUAGE plpgsql;
CREATE OR REPLACE TRIGGER hotel_room_delete BEFORE DELETE ON hotel
    FOR EACH ROW EXECUTE FUNCTION hotel_room_delete_prevent();
```
This trigger in action...
```
hotel=# delete from hotel where data->>'Id' = 'fifteen';
ERROR: Hotel ID fifteen has dependent rooms; cannot delete
CONTEXT: PL/pgSQL function hotel_room_delete_prevent() line 7 at RAISE
hotel=# select * from room;
data
-------------------------------------
{"Id": "one", "HotelId": "fifteen"}
(1 row)
```
There's that child record! We've successfully prevented an orphaned room.
### `ON DELETE CASCADE`
Rather than prevent deletion, another foreign key constraint option is to delete the dependent records as well; the delete "cascades" (like a waterfall) to the child tables. Implementing this is even less code!
```sql
CREATE OR REPLACE FUNCTION hotel_room_delete_cascade() RETURNS TRIGGER AS $$
BEGIN
    DELETE FROM room WHERE data->>'HotelId' = OLD.data->>'Id';
    RETURN OLD;
END;
$$ LANGUAGE plpgsql;
CREATE OR REPLACE TRIGGER hotel_room_delete BEFORE DELETE ON hotel
    FOR EACH ROW EXECUTE FUNCTION hotel_room_delete_cascade();
```
Here is what happens when we try the same `DELETE` statement that was prevented above...
```
hotel=# select * from room;
data
-------------------------------------
{"Id": "one", "HotelId": "fifteen"}
(1 row)
hotel=# delete from hotel where data->>'Id' = 'fifteen';
DELETE 1
hotel=# select * from room;
data
------
(0 rows)
```
We deleted a hotel, not rooms, but the rooms are now gone as well.
### `ON DELETE SET NULL`
The final option we will cover for a foreign key constraint is to set the column in the dependent table to `NULL`. There are two options to set a field to `NULL` in a `JSONB` document; we can either explicitly give the field a value of `null`, or we can remove the field from the document. As there is no schema, the latter is cleaner; PostgreSQL will return `NULL` for any non-existent field.
```sql
CREATE OR REPLACE FUNCTION hotel_room_delete_set_null() RETURNS TRIGGER AS $$
BEGIN
    UPDATE room SET data = data - 'HotelId' WHERE data->>'HotelId' = OLD.data->>'Id';
    RETURN OLD;
END;
$$ LANGUAGE plpgsql;
CREATE OR REPLACE TRIGGER hotel_room_delete BEFORE DELETE ON hotel
    FOR EACH ROW EXECUTE FUNCTION hotel_room_delete_set_null();
```
That `-` operator is new for us. When used on a `JSONB` field, it removes the named field from the document. (It is not defined for the plain `JSON` type.)
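As a quick illustration of the operator on its own, outside of any trigger, the statement below removes one key from a literal document. The literal is the same room document used in the sessions on this page; the `AS without_hotel` alias is only for illustration.

```sql
-- Cast a literal to JSONB, then remove the HotelId key from it.
-- The result is a document containing only the Id key.
SELECT '{"Id": "one", "HotelId": "fifteen"}'::jsonb - 'HotelId' AS without_hotel;
```

Removing a key that is not present is not an error; the document comes back unchanged.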
Let's watch it work...
```
hotel=# delete from hotel where data->>'Id' = 'fifteen';
ERROR: Hotel ID <NULL> does not exist
CONTEXT: PL/pgSQL function room_hotel_id_fk() line 7 at RAISE
SQL statement "UPDATE room SET data = data - 'HotelId' WHERE data->>'HotelId' = OLD.data->>'Id'"
PL/pgSQL function hotel_room_delete_set_null() line 3 at SQL statement
```
Oops! This trigger execution fired the `BEFORE UPDATE` trigger on `room`, and it took exception to us setting that value to `NULL`. The child table trigger assumes we'll always have a value. We'll need to tweak that trigger to allow this.
```sql
CREATE OR REPLACE FUNCTION room_hotel_id_nullable_fk() RETURNS TRIGGER AS $$
DECLARE
    hotel_id TEXT;
BEGIN
    IF NEW.data->>'HotelId' IS NOT NULL THEN
        SELECT data->>'Id' INTO hotel_id FROM hotel WHERE data->>'Id' = NEW.data->>'HotelId';
        IF hotel_id IS NULL THEN
            RAISE EXCEPTION 'Hotel ID % does not exist', NEW.data->>'HotelId';
        END IF;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE OR REPLACE TRIGGER hotel_enforce_fk BEFORE INSERT OR UPDATE ON room
    FOR EACH ROW EXECUTE FUNCTION room_hotel_id_nullable_fk();
```
Now, when we try to run the deletion, it works.
```
hotel=# select * from room;
data
-------------------------------------
{"Id": "one", "HotelId": "fifteen"}
(1 row)
hotel=# delete from hotel where data->>'Id' = 'fifteen';
DELETE 1
hotel=# select * from room;
data
---------------
{"Id": "one"}
(1 row)
```
## Should We Do This?
You may be thinking "Hey, this is pretty cool; why not do this everywhere?" Well, the answer is - as it is with _everything_ software-development-related - "it depends."
### No...?
The flexible, schemaless data storage paradigm that we call "document databases" allows changes to happen quickly. While "schemaless" can mean "ad hoc," in practice most documents have a well-defined structure. Not having to define columns for each item, then re-define or migrate them when things change, brings a lot of benefits.
What we've implemented above, in this example, complicates some processes. Sure, triggers can be disabled then re-enabled, but unlike true constraints, they do not validate existing data. If we were to disable triggers, run some updates, and re-enable them, we could end up with records that can't be saved in their current state.
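Because triggers do not check rows that already exist, one way to close that gap is to look for orphans by hand after re-enabling them. The query below is a minimal sketch, assuming the `room` and `hotel` tables from the examples above; the column aliases are illustrative.

```sql
-- List rooms whose HotelId does not match any hotel document.
-- Rooms with no HotelId at all are skipped, matching the nullable trigger.
SELECT r.data->>'Id'      AS room_id,
       r.data->>'HotelId' AS missing_hotel_id
  FROM room r
 WHERE r.data->>'HotelId' IS NOT NULL
   AND NOT EXISTS (
           SELECT 1
             FROM hotel h
            WHERE h.data->>'Id' = r.data->>'HotelId');
```

An empty result suggests the existing data would satisfy the child-side trigger; any rows returned would need to be fixed before they could be saved again.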
### Yes...?
The lack of referential integrity in document databases can be an impediment to adoption in areas where that paradigm may be more suitable than a relational one. To be sure, there are fewer relationships in a document database whose documents have complex structures, arrays, etc. This doesn't mean that there won't be relationships, though; in our hotel example, we could easily see a "reservation" document that has the IDs of a customer and a room. Just as it didn't make much sense to embed the rooms in a hotel document, it doesn't make sense to embed customers in a room document.
What PostgreSQL brings to all of this is that it does not have to be an all-or-nothing decision re: referential integrity. We can implement a document store with no constraints, then apply the ones we absolutely must have. We realize we're complicating maintenance a bit (though `pg_dump` will create a backup with the proper order for restoration), but we like that PostgreSQL will protect us from broken code or mistyped `UPDATE` statements.
## Going Further
As the trigger functions are executing SQL, it would be possible to create a set of reusable trigger functions that take table/column as parameters. Dynamic SQL in PL/pgSQL was additional complexity that would have distracted from the concepts. Feel free to take the examples above and make them reusable.
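For readers who want a starting point, here is one possible shape for a reusable child-side check. It is a sketch, not part of the library: the function name `enforce_document_fk` and its two trigger arguments (parent table, child field) are assumptions made for illustration, and it assumes every table has a `JSONB` column named `data` whose ID field is `Id`, as in the examples above.

```sql
-- Hypothetical reusable trigger function; arguments arrive via TG_ARGV.
--   TG_ARGV[0] = parent table name (e.g. 'hotel')
--   TG_ARGV[1] = field in the child document holding the parent ID (e.g. 'HotelId')
CREATE OR REPLACE FUNCTION enforce_document_fk() RETURNS TRIGGER AS $$
DECLARE
    parent_table TEXT;
    child_value  TEXT;
    parent_found BOOLEAN;
BEGIN
    parent_table := TG_ARGV[0];
    child_value  := NEW.data->>TG_ARGV[1];
    -- A missing field is allowed, as with the nullable trigger above
    IF child_value IS NOT NULL THEN
        -- %I quotes the table name as an identifier; the value is passed as a parameter
        EXECUTE format('SELECT EXISTS (SELECT 1 FROM %I WHERE data->>''Id'' = $1)', parent_table)
           INTO parent_found
          USING child_value;
        IF NOT parent_found THEN
            RAISE EXCEPTION '% ID % does not exist', parent_table, child_value;
        END IF;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- Reusing the trigger name from above replaces the earlier single-purpose trigger
CREATE OR REPLACE TRIGGER hotel_enforce_fk BEFORE INSERT OR UPDATE ON room
    FOR EACH ROW EXECUTE FUNCTION enforce_document_fk('hotel', 'HotelId');
```

Passing the table name through `format` with `%I` and the looked-up value through `USING` keeps the dynamic statement safe from quoting problems. The same pattern could be applied to the parent-side delete triggers.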
Finally, one piece we will not cover is `CHECK` constraints. These can be applied to tables using the `data->>'Key'` syntax, and can be used to apply more of a schema feel to the unstructured `JSONB` document. PostgreSQL's handling of JSON data really is first-class and unopinionated; you can use as much or as little as you like!
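Purely as a pointer, a `CHECK` constraint on a document table might look like the sketch below. The constraint names are hypothetical, and the `Name` field is taken from the demo hotel inserted earlier; both statements assume `data` is a `JSONB` column.

```sql
-- Require every room document to have an Id
ALTER TABLE room
    ADD CONSTRAINT room_id_required CHECK (data->>'Id' IS NOT NULL);

-- If a hotel has a Name, require it to be a JSON string.
-- A missing Name makes the expression NULL, which a CHECK constraint accepts.
ALTER TABLE hotel
    ADD CONSTRAINT hotel_name_is_string CHECK (jsonb_typeof(data->'Name') = 'string');
```

Unlike the triggers above, adding a constraint this way validates the rows already in the table, so it will fail if existing documents violate it.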
[« Return to "Advanced Usage" for `PDODocument`][adv-pdo]
[see below]: #on-delete-set-null
[adv-pdo]: https://bitbadger.solutions/open-source/relational-documents/php/advanced-usage.html "Advanced Usage • PDODocument • Bit Badger Solutions"

@ -11,8 +11,6 @@
href: advanced/related.md
- name: Transactions
href: advanced/transactions.md
- name: Referential Integrity
href: advanced/integrity.md
- name: Upgrading
items:
- name: v3 to v4

favicon.ico: new binary file, 9.3 KiB (contents not shown)

@ -1,28 +1,24 @@
---
_layout: landing
title: Welcome!
---
BitBadger.Documents provides a lightweight document-style interface over [PostgreSQL][]'s and [SQLite][]'s JSON storage capabilities, with first-class support for both C# and F# programs. _(It is developed by the community; it is not officially affiliated with either project.)_
> NOTE: v4.1 is the latest version. See below for upgrading.
> [!TIP]
> Expecting `BitBadger.Npgsql.Documents`? This library replaced it as of v3.
## Installing
### PostgreSQL
### PostgreSQL [![Nuget (with prereleases)][pkg-shield-pgsql]][pkg-link-pgsql]
[![Nuget (with prereleases)][pkg-shield-pgsql]][pkg-link-pgsql]
```
```shell
dotnet add package BitBadger.Documents.Postgres
```
### SQLite
### SQLite [![Nuget (with prereleases)][pkg-shield-sqlite]][pkg-link-sqlite]
[![Nuget (with prereleases)][pkg-shield-sqlite]][pkg-link-sqlite]
```
```shell
dotnet add package BitBadger.Documents.Sqlite
```
@ -40,7 +36,7 @@ dotnet add package BitBadger.Documents.Sqlite
## Why Documents?
Document databases usually store <abbr title="JavaScript Object Notation">JSON</abbr> objects (as their "documents") to provide a schemaless persistence of data; they also provide fault-tolerant ways to query that possibly-unstructured data. [MongoDB][] was the pioneer and is the leader in this space, but there are several who provide their own take on it, and their own programming <abbr title="Application Programming Interface">API</abbr> to come along with it. They also usually have some sort of clustering, replication, and sharding solution that allows them to be scaled out (horizontally) to handle a large amount of traffic.
Document databases usually store <abbr title="JavaScript Object Notation">JSON</abbr> objects (as their "documents") to provide schemaless persistence of data; they also provide fault-tolerant ways to query that possibly-unstructured data. [MongoDB][] was the pioneer and is the leader in this space, but there are several who provide their own take on it, and their own programming <abbr title="Application Programming Interface">API</abbr> to come along with it. They also usually have some sort of clustering, replication, and sharding solution that allows them to be scaled out (horizontally) to handle a large amount of traffic.
As a mature relational database, PostgreSQL has a long history of robust data access from the .NET environment; Npgsql is actively developed, and provides both ADO.NET and <abbr title="Entity Framework">EF</abbr> Core APIs. PostgreSQL also has well-established, battle-tested horizontal scaling options. Additionally, the [Npgsql.FSharp][] project provides a functional API over Npgsql's ADO.NET data access. These three factors make PostgreSQL an excellent choice for document storage, and its relational nature can help in areas where traditional document databases become more complex.
@ -67,7 +63,7 @@ PostgreSQL is the most popular non-WordPress database for good reason.
The [SQLite "About" page][sqlite-about] has a short description of the project and its strengths. Simplicity, flexibility, and a large install base speak for themselves. A lot of people believe they will need a lot of features offered by server-based relational databases, and live with that complexity even when the project is small. A smarter move may be to build with SQLite; if the need arises for something more, the project is very likely a success!
Many of the benefits listed for PostgreSQL apply here as well, including its test coverage - but SQLite removes the requirement to run it as a server!
Many of the benefits listed for PostgreSQL apply here as well, including its test coverage, but SQLite removes the requirement to run it as a server!
## Support
@ -94,4 +90,4 @@ Issues can be filed on the project's GitHub repository.
[Litestream]: https://litestream.io/ "Litestream"
[sqlite-about]: https://sqlite.org/about.html "About • SQLite"
[json-ops]: https://www.postgresql.org/docs/15/functions-json.html#FUNCTIONS-JSON-OP-TABLE "JSON Functions and Operators • Documentation • PostgreSQL"
[tests]: https://github.com/bit-badger/BitBadger.Documents/actions/workflows/ci.yml "Actions • BitBadger.Documents • GitHub"
[tests]: https://git.bitbadger.solutions/bit-badger/BitBadger.Documents/releases "Releases • BitBadger.Documents • Bit Badger Solutions Git"

@ -8,11 +8,11 @@ This package provides common definitions and functionality for `BitBadger.Docume
- Select, insert, update, save (upsert), delete, count, and check existence of documents, and create tables and indexes for these documents
- Automatically generate IDs for documents (numeric IDs, GUIDs, or random strings)
- Addresses documents via ID and via comparison on any field (for PostgreSQL, also via equality on any property by using JSON containment, or via condition on any property using JSON Path queries)
- Accesses documents as your domain models (<abbr title="Plain Old CLR Objects">POCO</abbr>s)
- Uses `Task`-based async for all data access functions
- Uses building blocks for more complex queries
- Address documents via ID and via comparison on any field (for PostgreSQL, also via equality on any property by using JSON containment, or via condition on any property using JSON Path queries)
- Access documents as your domain models (<abbr title="Plain Old CLR Objects">POCO</abbr>s), as JSON strings, or as JSON written directly to a `PipeWriter`
- Use `Task`-based async for all data access functions
- Use building blocks for more complex queries
## Getting Started
Install the library of your choice and follow its README; also, the [project site](https://bitbadger.solutions/open-source/relational-documents/) has complete documentation.
Install the library of your choice and follow its README; also, the [project site](https://relationaldocs.bitbadger.solutions/dotnet/) has complete documentation.

@ -6,12 +6,12 @@
<AssemblyVersion>4.1.0.0</AssemblyVersion>
<FileVersion>4.1.0.0</FileVersion>
<VersionPrefix>4.1.0</VersionPrefix>
<PackageReleaseNotes>Add JSON retrieval and stream-writing functions</PackageReleaseNotes>
<PackageReleaseNotes>Add JSON retrieval and pipe-writing functions; update project URL to site with public API docs</PackageReleaseNotes>
<Authors>danieljsummers</Authors>
<Company>Bit Badger Solutions</Company>
<PackageReadmeFile>README.md</PackageReadmeFile>
<PackageIcon>icon.png</PackageIcon>
<PackageProjectUrl>https://bitbadger.solutions/open-source/relational-documents/</PackageProjectUrl>
<PackageProjectUrl>https://relationaldocs.bitbadger.solutions/dotnet/</PackageProjectUrl>
<PackageRequireLicenseAcceptance>false</PackageRequireLicenseAcceptance>
<RepositoryUrl>https://git.bitbadger.solutions/bit-badger/BitBadger.Documents</RepositoryUrl>
<RepositoryType>Git</RepositoryType>

@ -13,7 +13,7 @@ This package provides a lightweight document library backed by [PostgreSQL](http
## Upgrading from v3
There is a breaking API change for `ByField` (C#) / `byField` (F#), along with a compatibility namespace that can mitigate the impact of these changes. See [the migration guide](https://bitbadger.solutions/open-source/relational-documents/upgrade-from-v3-to-v4.html) for full details.
There is a breaking API change for `ByField` (C#) / `byField` (F#), along with a compatibility namespace that can mitigate the impact of these changes. See [the migration guide](https://relationaldocs.bitbadger.solutions/dotnet/upgrade/v4.html) for full details.
## Getting Started
@ -71,7 +71,7 @@ var customer = await Find.ById<string, Customer>("customer", "123");
// Find.byId type signature is string -> 'TKey -> Task<'TDoc option>
let! customer = Find.byId<string, Customer> "customer" "123"
```
_(keys are treated as strings or numbers depending on their defintion; however, they are indexed as strings)_
_(keys are treated as strings or numbers depending on their definition; however, they are indexed as strings)_
Count customers in Atlanta (using JSON containment):
@ -103,4 +103,4 @@ do! Delete.byJsonPath "customer" """$.City ? (@ == "Chicago")"""
## More Information
The [project site](https://bitbadger.solutions/open-source/relational-documents/) has full details on how to use this library.
The [project site](https://relationaldocs.bitbadger.solutions/dotnet/) has full details on how to use this library.

@ -13,7 +13,7 @@ This package provides a lightweight document library backed by [SQLite](https://
## Upgrading from v3
There is a breaking API change for `ByField` (C#) / `byField` (F#), along with a compatibility namespace that can mitigate the impact of these changes. See [the migration guide](https://bitbadger.solutions/open-source/relational-documents/upgrade-from-v3-to-v4.html) for full details.
There is a breaking API change for `ByField` (C#) / `byField` (F#), along with a compatibility namespace that can mitigate the impact of these changes. See [the migration guide](https://relationaldocs.bitbadger.solutions/dotnet/upgrade/v4.html) for full details.
## Getting Started
@ -103,4 +103,4 @@ do! Delete.byFields "customer" Any [ Field.Equal "City" "Chicago" ]
## More Information
The [project site](https://bitbadger.solutions/open-source/relational-documents/) has full details on how to use this library.
The [project site](https://relationaldocs.bitbadger.solutions/dotnet/) has full details on how to use this library.