From 1575d828bfd8caaa15e672190c199488ec38103b Mon Sep 17 00:00:00 2001
From: "Daniel J. Summers"
Date: Fri, 18 Apr 2025 18:52:30 -0400
Subject: [PATCH] WIP on getting started doc

---
 .idea/libraries/Maven__scala_sdk_3_5_2.xml | 26 -------
 docs/getting-started.md                    | 88 ++++++++++++++++++++++
 2 files changed, 88 insertions(+), 26 deletions(-)
 delete mode 100644 .idea/libraries/Maven__scala_sdk_3_5_2.xml

diff --git a/.idea/libraries/Maven__scala_sdk_3_5_2.xml b/.idea/libraries/Maven__scala_sdk_3_5_2.xml
deleted file mode 100644
index edaffc7..0000000
--- a/.idea/libraries/Maven__scala_sdk_3_5_2.xml
+++ /dev/null
@@ -1,26 +0,0 @@
- - - - Scala_3_5 - - - - - - - - - - - - - - - - file://$MAVEN_REPOSITORY$/org/scala-lang/scala3-sbt-bridge/3.5.2/scala3-sbt-bridge-3.5.2.jar - - - - - -
\ No newline at end of file
diff --git a/docs/getting-started.md b/docs/getting-started.md
index 1900b27..9bf7eb8 100644
--- a/docs/getting-started.md
+++ b/docs/getting-started.md
@@ -1,5 +1,89 @@
 # Getting Started
 
+## Determining the Lay of the Land
+
+While the aim is to provide a library that makes it easy to get up and running, thinking through some decisions up front will pay off as we go.
+
+- Each module has document access functions which take a JDBC `Connection` as their last parameter, and versions of those functions which do not. For the latter, the library will create a connection for each execution. For all languages except Java, there are also extension methods on the `Connection` object itself. We'll discuss connection management more below.
+- Document IDs default to the `id` property for the given domain object/document, but can be named whatever you like. The library can also generate three forms of automatic IDs for documents. As with connection management, more discussion and considerations will be presented below.
+- No assumption is made as to how documents are serialized and deserialized.
While this allows the library to also have no dependencies on any (possibly conflicting) JSON library, it does require a two-method interface implementation. (The exception here is `kotlinx`, which does have a built-in serializer based on `kotlinx.serialization`.)
+
+Projects will need a dependency on their chosen library. The `core` module is a dependency of the other modules, so Groovy, Scala, and KotlinX projects only need the `groovy`, `scala`, or `kotlinx` dependencies, respectively. Java or reflection-based-serialization Kotlin projects can depend on `core` directly.
+
+## Connections
+
+Any database library will need a connection string, and this one is no different. If you are using a library or framework which provides a way to configure connection strings, use that. If not, a great way to configure connections (or the sensitive parts of them) is via environment variables. Java's `System.getenv(key)` static method can [read the value of an environment variable][env].
+
+However the connection string is configured, the library needs to know about it. The `Configuration` class (found in the `solutions.bitbadger.documents` package, part of the `core` module) has a static property `connectionString`; set it to the connection string you have configured.
+
+### Connection Management
+
+Once the connection string has been configured, `Configuration.dbConn()` will return new connections to that database. Combining this with extension methods provides several options:
+
+- **Do nothing.** This will result in a new `Connection` being obtained for each request, which may seem crazy! However, SQLite connections are local actions, and connection pooling can mitigate the overhead of the multiple PostgreSQL connections required to satisfy a particular application action.
+- **Use a connection from your DI container.** Combined with the extension methods (or functions with a `Connection` parameter), this can be a great way to introduce documents into an existing application.
All queries will be executed on the given connection, and the DI container can manage the lifetime (in the context of web requests, likely per-request).
+- **Configure this library to provide the DI container's connection.** If you can set up your container to run custom code to return its objects (i.e., factories), `Configuration.dbConn()` can be treated as a connection factory.
+
+> [!NOTE]
+> That is a lot of options (and the list omits ad-hoc and hybrid approaches). On the other hand, this is a low-stress decision for those getting started. For some, one of those options will trigger the "Yeah, that's it!" response; in that case, go with that. For others, pick one and get started. For web applications, the DI-provided connection is a good choice. The library still needs to be configured so it knows what type of database it is targeting, but the connection does not have to be provided by the library; any JDBC connection will do.
+
+## Document IDs
+
+### Naming IDs
+
+As mentioned above, the default configuration is to use the document field `id` as the identifier for each document. For projects that want to use a different name (e.g., `key`), set the `Configuration.idField` property to whatever value will be used.
+
+Unlike the connection strategy, this is a decision to make up front; once documents exist, this cannot be easily changed.
+
+### Automatic IDs
+
+Relational databases provide several ways to create automatic IDs, the most common being ever-increasing numbers or UUIDs/GUIDs. This library provides replacements for (or approximations of) these options, all defined in the `AutoId` enum.
+
+- `DISABLED` - no automatic IDs are applied; your IDs are your ~~problem~~ responsibility.
+- `NUMBER` - a `MAX + 1`-style algorithm is applied if the document has a numeric ID with the value `0`.
_(This is applied as a subquery on the `INSERT` statement; it should not be considered nearly as robust as a sequence.)_
+- `UUID` - a `String` UUID is generated for documents with blank string ID fields.
+- `RANDOM_STRING` - a string of random hex characters is generated for documents with blank string ID fields; the length of this string is controlled by `Configuration.idStringLength`.
+
+In all automatic generation cases, if the document being inserted has an ID value already, it is passed through unmodified.
+
+> [!WARNING]
+> For `NUMBER` auto IDs, both PostgreSQL and SQLite will have trouble if any document with a string ID is written. Numbers can be treated as strings, but strings cannot be treated as numbers. (SQLite will do its best - if a string has a numeric value, it will work - but PostgreSQL will fail spectacularly in this case.)
+
+> [!TIP]
+> `AutoId.generateRandomString(length)` can be used to generate random hex strings of a specified length, not just the one specified in the configuration. Also, `AutoId.generateUUID()` can be used to generate a lowercase UUID with no dashes, regardless of the configured `AutoId` values.
+>
+> (Non-Kotlin projects may need to specify `AutoId.Companion` to see these functions.)
+
+## Document Serialization
+
+### Traditional (AKA "reflection-based")
+
+With many applications already defining a JSON API, a document data store can utilize whatever JSON serialization strategies these applications already employ. In this case, implementing a `DocumentSerializer` (found in the `solutions.bitbadger.documents` package) is trivial; its methods can delegate to the existing serialization and deserialization process.
+
+For new applications, or applications that do not already have JSON serialization as part of their normal process, the integration tests for the `core`, `groovy`, and `scala` modules have examples of a `DocumentSerializer` implementation using Jackson's default options.
The project will need a dependency on `jackson-databind`, but that implementation is trivial (which is why it is duplicated in each module's integration tests).
+
+Once the serializer is created, set the `DocumentConfig.serializer` property to an instance of that serializer. (`DocumentConfig` is in the `solutions.bitbadger.documents.java` package.)
+### Using `kotlinx.serialization`
+
+The `kotlinx` module configures the serializer with the following default options:
+
+- Coerce Input Values = true; this means that `null` values in JSON will be represented by the class's default property value rather than being `null`.
+- Encode Defaults = true; this means properties with default values will have those values encoded as part of the output JSON.
+- Explicit Nulls = false; this means that `null` values will not be written to the output JSON. For documents with many optional values, this can make a decent size difference once many documents are stored.
+
+Any of the [KotlinX Json][kx-json] properties can be set on the `options` property of `DocumentConfig` in the `solutions.bitbadger.documents.kotlinx` package. As with reflection-based serialization, if the project already has a set of `Json` properties, the existing configuration can be replaced with that set.
+
+## Document Tables
+
+> [!NOTE]
+> If you want to customize the document's `id` field, this needs to be done before tables are created.
+
+The final step to being able to store and retrieve documents is to define one or more tables for them.
+
+// TODO: stopped here
+
+## Previous Instructions
+
 The exceedingly quick version:
 
 - Add the desired module/package to your dependencies
@@ -15,3 +99,7 @@ From there, `Definition.ensureTable(name)` will create a table, and you are read
 ## Examples
 
 Each library has an exhaustive suite of integration tests; reviewing those may also provide insight into the patterns used for effective document storage and retrieval.
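
As a complement to those tests, the behavior described under Connections and Automatic IDs can be approximated in plain Java. This is a minimal sketch under stated assumptions, not the library's implementation: the class `IdSketch`, its method names, the `DOCS_DB_URL` variable, and the SQLite fallback URL are all hypothetical. Only `System.getenv`, `SecureRandom`, and `UUID` are standard JDK calls.

```java
import java.security.SecureRandom;
import java.util.UUID;

// Hypothetical helper class; none of these names come from the library.
public class IdSketch {

    private static final SecureRandom RANDOM = new SecureRandom();
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    /** Reads an environment variable, returning the fallback if it is unset or blank. */
    public static String envOrDefault(String name, String fallback) {
        String value = System.getenv(name);
        return (value == null || value.trim().isEmpty()) ? fallback : value;
    }

    /** Builds a string of random lowercase hex characters of the requested length. */
    public static String randomHex(int length) {
        if (length < 0) {
            throw new IllegalArgumentException("length must not be negative");
        }
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append(HEX[RANDOM.nextInt(HEX.length)]);
        }
        return sb.toString();
    }

    /** Produces a random UUID as 32 lowercase hex characters with the dashes removed. */
    public static String dashlessUuid() {
        return UUID.randomUUID().toString().replace("-", "").toLowerCase();
    }

    public static void main(String[] args) {
        // DOCS_DB_URL is an assumed variable name; the fallback is an assumed SQLite URL.
        String url = envOrDefault("DOCS_DB_URL", "jdbc:sqlite:example.db");
        System.out.println("Connection string: " + url);
        System.out.println("Random hex (16):   " + randomHex(16));
        System.out.println("Dashless UUID:     " + dashlessUuid());
    }
}
```

In a real project, the value returned by `envOrDefault` would be assigned to `Configuration.connectionString`, and the library's own `AutoId` functions would be used in place of the two ID helpers.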
+
+
+[env]: https://docs.oracle.com/javase/tutorial/essential/environment/env.html "Environment Variables • The Java Tutorials"
+[kx-json]: https://github.com/Kotlin/kotlinx.serialization/blob/master/docs/json.md "JSON Features   kotlinx.serialization • GitHub"