All notable changes to this project will be documented in this file.
✨ Features
- New package `ETLBox.MongoDB`: `MongoChangeStreamSource<TOutput>` tails a MongoDB change stream and emits one record per change event. Requires a replica set deployment. Accepts a caller-provided `IMongoClient`, an optional aggregation `Pipeline` for server-side filtering, and (for resumable processing) a `CheckpointStore` + `CheckpointId`. To checkpoint, surface the change-stream resume token on the mapped output so the `CheckpointWriter` can commit it (see below).
- New package `ETLBox.PostgresStreaming`: `PostgresXminTailSource<TOutput>` continuously polls a PostgreSQL table using `xmin`-frontier polling (`pg_snapshot_xmin(pg_current_snapshot())`). Rows inserted by in-flight transactions are excluded from each batch and automatically picked up once their transaction commits. Supports cursor pagination via `OrderByColumns`, server-side predicate filtering via `AdditionalWhere`, and resumable processing via `CheckpointStore` + `CheckpointId`. To stream UPDATEs (not just INSERTs), the cursor column must be re-stamped on every write (e.g. a `bigint` filled by a server-side sequence); use a server-side value, not an app-generated one, or concurrent writers can defeat the frontier.
- New: at-least-once checkpointing in `ETLBox.Common.DataFlow.Streaming`. `ICheckpointStore<TPosition>` (`where TPosition : IComparable<TPosition>`) persists a typed, monotone stream position keyed by `checkpointId`: one stream can be tailed by many independent consumers, each with its own checkpoint (the Kafka consumer-group model). `CheckpointWriter<TInput, TPosition>` is a terminal destination placed after the real destination; it commits the position (extracted from the record via a `Position` selector) only once a record has been durably written downstream, advancing strictly forward. A crash between the destination write and the commit replays the record (a duplicate) rather than dropping it. Delivery is therefore at-least-once, and consumers must be idempotent. For a co-located destination + checkpoint, call `CommitAsync` inside the destination's transaction for effective exactly-once. `DbCheckpointStore<TPosition>` is a ready-made store over an ETLBox `IConnectionManager` (configurable table/key/position columns; positions stored natively).
  - The sources are load-only: they load the committed position on start and never commit it themselves. Implement `ICheckpointStore<TPosition>` for any backend (Redis, database, file, …).
- New: `DataFlowResources` helper class in `ETLBox.Serialization`. Provides a composable, thread-safe implementation of dataflow resource ownership: the `IDataFlow` connection manager pool and the `IDataFlowResourceOwner` disposable resource pool. Embed it as a field and delegate `GetOrAddConnectionManager` / `GetOrAddResource` to it to avoid re-implementing the `ConcurrentDictionary` boilerplate in every `IDataFlow` implementor.
- New: `IDataFlowResourceOwner`, an optional capability interface that declares the full resource-ownership contract: `GetOrAddConnectionManager(...)` and `GetOrAddResource(string key, Func<IDisposable> factory)` (plus the generic `DataFlowResourceOwnerExtensions.GetOrAddResource<T>` extension). `GetOrAddConnectionManager` also stays on `IDataFlow` for backward compatibility, so a data flow implementing both interfaces satisfies them with one method, and the composable `DataFlowResources` helper now implements every resource method through this single interface rather than exposing a contract-less public method. Exposing `GetOrAddResource` here rather than directly on `IDataFlow` keeps disposable-resource ownership from binary-breaking existing external `IDataFlow` implementations compiled against earlier versions. `DataFlowXmlReader` probes for the capability (`is IDataFlowResourceOwner`) and, when present, automatically registers `IDisposable` component properties with the owning flow: components with identical XML configuration share a single instance (deduplicated by type + content key) and are disposed with the flow. When the flow does not implement `IDataFlowResourceOwner`, the reader falls back to plain instance creation (no dedup, no flow-owned disposal), preserving pre-existing behavior. This applies to both concrete class properties (e.g., `MongoClient`) and abstract/interface properties that resolve to an `IDisposable` implementation.
- New: `ILifetimeAwareActivator`, an optional `IDataFlowActivator` capability that reports whether instances of a type are owned by an external scope (e.g. a DI container). When a disposable property resolves to a container-managed service (`ServiceProviderActivator` over a registered type), the data flow no longer takes ownership of it: the container's lifetime applies and the flow does not dispose it. Instances created fresh by the activator (e.g. `DefaultDataFlowActivator`) remain flow-owned and are disposed with the flow.
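
The write-then-commit ordering described in the at-least-once checkpointing entry above can be sketched in a few lines. This is a minimal illustration, not the ETLBox implementation: `InMemoryCheckpointStore<TPosition>`, `CheckpointSketch.Process` and the `idempotentWrite` callback are hypothetical names invented for this sketch, and the real `ICheckpointStore<TPosition>` and `CheckpointWriter<TInput, TPosition>` signatures may differ.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a checkpoint store; NOT the ETLBox interface.
public sealed class InMemoryCheckpointStore<TPosition>
    where TPosition : struct, IComparable<TPosition>
{
    private readonly Dictionary<string, TPosition> _positions = new();

    // Returns the committed position for this consumer, or null if none exists yet.
    public TPosition? Load(string checkpointId) =>
        _positions.TryGetValue(checkpointId, out var position) ? position : (TPosition?)null;

    // Advances strictly forward: an equal or older position is ignored.
    public bool Commit(string checkpointId, TPosition position)
    {
        if (_positions.TryGetValue(checkpointId, out var current)
            && position.CompareTo(current) <= 0)
        {
            return false;
        }
        _positions[checkpointId] = position;
        return true;
    }
}

public static class CheckpointSketch
{
    // Write first, commit second. A crash between the two steps means the record
    // is seen again after restart (a duplicate), so the write must be idempotent.
    public static void Process(
        IEnumerable<(long Position, string Payload)> stream,
        InMemoryCheckpointStore<long> store,
        string checkpointId,
        Action<long, string> idempotentWrite)
    {
        long? committed = store.Load(checkpointId);
        foreach (var (position, payload) in stream)
        {
            // Skip everything at or before the committed position.
            if (committed.HasValue && position <= committed.Value) continue;

            idempotentWrite(position, payload);   // durable write downstream
            store.Commit(checkpointId, position); // only then advance the checkpoint
        }
    }
}
```

Because each consumer passes its own `checkpointId`, two consumers reading the same stream keep independent positions, which is the consumer-group behavior the entry describes.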
🐛 Bug Fixes
- Fixed (SUPPORT-56620): `KafkaTransformation` now logs `DeliveryReport` errors through the standard ETLBox logger when a Kafka broker reports a delivery failure, so per-message failures surface instead of being swallowed by the producer callback.
- Fixed (SUPPORT-56620): `KafkaTransformation.CleanUp` now disposes the underlying producer in a `finally` block, so a `Flush` that throws still releases the producer instead of leaking it. The experimental `FlushTimeout` property and the "Kafka flush timed out" exception were removed; `Flush()` runs with the Confluent driver's default behavior again.
- Fixed (RSSL-11704): `PostgresXminTailSource` now quotes `OrderByColumns` identifiers in the generated SQL via `AppendQuotedColumnList`. Mixed-case columns created with quoted identifiers (e.g. `"StreamPosition"`) previously failed in `ORDER BY` and tuple-cursor `WHERE` clauses because PostgreSQL folds unquoted identifiers to lowercase. `IDataRecord` lookups (`GetOrdinal`) stay case-insensitive and continue to use bare names.
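
To illustrate the identifier-quoting fix above (RSSL-11704): PostgreSQL treats a double-quoted identifier as case-sensitive, and an embedded double quote is escaped by doubling it. The helper below is a hypothetical sketch of that rule; `IdentifierQuoting`, `Quote` and `QuoteList` are names made up for this example, and the changelog does not show how `AppendQuotedColumnList` is actually implemented.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only; not the ETLBox AppendQuotedColumnList.
public static class IdentifierQuoting
{
    // Wraps the identifier in double quotes and doubles any embedded quote,
    // so "StreamPosition" keeps its case instead of being folded to lowercase.
    public static string Quote(string identifier)
    {
        if (string.IsNullOrEmpty(identifier))
            throw new ArgumentException("Identifier must not be empty.", nameof(identifier));
        return "\"" + identifier.Replace("\"", "\"\"") + "\"";
    }

    // Builds a comma-separated column list, e.g. for an ORDER BY clause.
    public static string QuoteList(IEnumerable<string> columns) =>
        string.Join(", ", columns.Select(Quote));
}
```

With this sketch, `QuoteList(new[] { "StreamPosition", "id" })` yields `"StreamPosition", "id"`, which matches a column created with a quoted mixed-case name.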
🔧 Internal
- `ETLBox.MongoDB` upgrades `MongoDB.Driver` from `2.28.0` to `3.8.0` and retargets from `netstandard2.0` to `net6.0`. MongoDB.Driver 3.x dropped `netstandard2.0` support, so consumers of `ETLBox.MongoDB` now need `net6.0` or newer; the rest of the ETLBox libraries are unaffected.
- CI: `test_job` migrated from Docker-in-Docker to KubeDock (SYSOPS-1668), Testcontainers Ryuk is disabled per-repo (SYSOPS-1667), and `hotfix/*` branches inherit their version from the nearest source branch instead of always patching `master`.
✨ Features
- New: flat XML sequence syntax for `ETLBox.Serialization` via `Pipeline<TIn, TOut>` and the non-generic `Pipeline`. A `<Pipeline>` can now list sources, transformations, and destinations in execution order instead of requiring deeply nested `<LinkTo>` elements. Existing nested `<LinkTo>` XML remains supported.

  Example:

  ```xml
  <EtlDataFlowStep>
    <MemorySource>
      <LinkTo>
        <Pipeline>
          <JsonTransformation />
          <ScriptedTransformation />
          <MemoryDestination />
        </Pipeline>
      </LinkTo>
    </MemorySource>
  </EtlDataFlowStep>
  ```
- New: `IDataFlowXmlSerializable` and `IDataFlowXmlContext` extension points in `ETLBox.Serialization`. Components can now take control of their XML deserialization while still creating child objects through the reader's DI-aware factory.
- New: `PassThrough` property on `JsonTransformation`. When `true`, all input fields are copied to the output before `Mappings` are applied, allowing mappings to add new fields or override copied ones. When `false` (default), only mapped fields are emitted.
- New: `JsonTransformation.ParseNative(string)` and native JSON object conversion. Mappings with `Path="$"` now return a native `ExpandoObject` instead of a JSON string, with nested objects, arrays, numbers, booleans, dates, and nulls converted to .NET values.
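
The `PassThrough` semantics described above (copy all input fields first, then let mappings add or override) can be sketched independently of the library. The code below is an assumption-laden illustration: `PassThroughSketch.Apply` and its dictionary-based record shape are hypothetical and are not how `JsonTransformation` is implemented.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration of pass-through semantics; not ETLBox code.
public static class PassThroughSketch
{
    public static IDictionary<string, object> Apply(
        IDictionary<string, object> input,
        IDictionary<string, Func<IDictionary<string, object>, object>> mappings,
        bool passThrough)
    {
        var output = new Dictionary<string, object>();

        // Step 1: with pass-through enabled, copy every input field unchanged.
        if (passThrough)
        {
            foreach (var field in input)
                output[field.Key] = field.Value;
        }

        // Step 2: mappings run afterwards, so they can add new fields
        // or override the copied ones.
        foreach (var mapping in mappings)
            output[mapping.Key] = mapping.Value(input);

        return output;
    }
}
```

With `passThrough` set to `false`, step 1 is skipped and only the mapped fields appear in the output, matching the default behavior the entry describes.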
🐛 Bug Fixes
- Fixed: `JsonTransformation` now returns `null` when a JSONPath does not match any token.
- Fixed: `Pipeline` completion handling for XML flows without an external `LinkTo`. Pipeline output is drained automatically when needed so execution can complete without hanging.
- Fixed: root-level `<Pipeline>` execution tracking in `DataFlowXmlReader`. A pipeline used as the root source is registered for completion tracking even when it contains no external destination.
- Fixed: pipeline step type validation for components that implement more than one `IDataFlowLinkTarget<T>` interface, such as batched destinations.
- Fixed: `DataFlowXmlReader` context type resolution now catches expected lookup exceptions when a custom XML-serializable component probes for optional child types.
🔧 Internal
- CI package versioning now uses `GitVersion_SemVer` for NuGet packages and a separate assembly version with the GitLab pipeline IID.
- Updated `GitVersion.yml` branch rules for `1.18.0` prerelease and hotfix flows.
- Changed the shared C# language version setting from `12` to `latest`.
✨ Features
- New: `AdditionalImports` property on `ScriptedRowTransformation<TInput, TOutput>`. Accepts a list of namespaces to import into every `Mappings` expression, equivalent to `using` directives. For example, adding `"System.Text.Json"` allows writing `JsonSerializer.Serialize(…)` instead of the fully qualified `System.Text.Json.JsonSerializer.Serialize(…)`.
- Improvement: `AdditionalAssemblyNames` on `ScriptedRowTransformation<TInput, TOutput>` now accepts both file paths (e.g. `Files/MyLib.dll`) and runtime assembly names (e.g. `System.Text.Json`). Previously only file paths were supported, making it impossible to reference system assemblies already loaded in the process.
- New: `NullableContextOptions` property on `ScriptedRowTransformation<TInput, TOutput>` (and the non-generic alias `ScriptedTransformation`). Controls the nullable annotation context for compiled `Mappings` expressions. Defaults to `NullableContextOptions.Disable` for backward compatibility. Set to `NullableContextOptions.Enable` to use nullable annotations such as `string?` and the null-conditional operator `?.` inside scripts. `TypedScriptBuilder.WithNullableContextOptions(NullableContextOptions)` is also available for direct users of the low-level scripting API.
🐛 Bug Fixes
- Fixed: `AdditionalAssemblyNames` was silently ignored when using typed `TInput` / `TOutput` (i.e. any non-`ExpandoObject` pair). Additional assemblies were only passed to the script compiler on the dynamic path; the typed path omitted the `WithReferences` call, so scripts referencing types from external assemblies would fail to compile.
- New: `PassThrough` property on `ScriptedRowTransformation<TInput, TOutput>` (and the non-generic alias `ScriptedTransformation`). When `true`, all input fields are copied to the output before `Mappings` are applied, so fields not listed in `Mappings` are preserved unchanged. `Mappings` can still add new fields or override copied ones. When `false` (default), only fields explicitly listed in `Mappings` appear in the output.

  Example XML usage (`PassThrough` mode):

  ```xml
  <ScriptedTransformation>
    <PassThrough>true</PassThrough>
    <Mappings>
      <!-- Adds new field FullName; original fields FirstName and LastName are preserved -->
      <FullName>$"{FirstName} {LastName}"</FullName>
      <!-- Overrides existing field Amount -->
      <Amount>Amount * 1.2</Amount>
    </Mappings>
    <LinkTo>
      <MemoryDestination />
    </LinkTo>
  </ScriptedTransformation>
  ```
🐛 Bug Fixes
- Fixed: `ArgumentOutOfRangeException` during XML deserialization of `DbMerge` when using `ServiceProviderActivator`. When `ILogger` was registered in DI, `ServiceProviderActivator` resolved `DbMerge` via the `DbMerge(ILogger)` constructor, which left `BatchSize = 0`. The subsequent `set_TableName` immediately created an internal `DbDestination(batchSize: 0)`, which triggered `BatchBlock` creation with `BoundedCapacity = 0 * 3 = 0`, causing the exception.
  - `DataFlowBatchDestination.BatchSize` setter now treats `value <= 0` as "not set" (stores `null`), so `InitObjects` uses `DefaultBatchSize = 1000`
  - Same fix applied to `RowBatchTransformation.BatchSize` (same vulnerable pattern)
  - `DbMerge.BatchSize` changed from auto-property to backing-field property initialized to `DefaultBatchSize`; setting `BatchSize` after `TableName` now propagates to internal `DestinationTable`
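
The setter pattern behind the `BatchSize` fix above can be sketched as follows. This is a simplified, hypothetical class (`BatchOptionsSketch`), not the actual `DataFlowBatchDestination`; only the default of 1000 and the `BatchSize * 3` capacity figure are taken from the entry, and the property names `EffectiveBatchSize` and `BoundedCapacity` are assumptions for illustration.

```csharp
// Hypothetical sketch of "treat non-positive values as not set"; not ETLBox code.
public class BatchOptionsSketch
{
    public const int DefaultBatchSize = 1000;

    private int? _batchSize;

    // A value of 0 or less is stored as null, meaning "not set".
    public int? BatchSize
    {
        get => _batchSize;
        set => _batchSize = (value.HasValue && value.Value > 0) ? value : null;
    }

    // Falls back to the default when no valid batch size was supplied.
    public int EffectiveBatchSize => _batchSize ?? DefaultBatchSize;

    // Mirrors the "BatchSize * 3" capacity quoted in the entry; because the
    // effective batch size is always positive, the capacity can never be 0.
    public int BoundedCapacity => EffectiveBatchSize * 3;
}
```

A constructor path that leaves the batch size at 0 therefore ends up with an effective batch size of 1000 rather than a zero-capacity buffer.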
✨ Features
- New: DI-based activator mode for `DataFlowXmlReader`. Introduced `IDataFlowActivator` abstraction with two implementations:
  - `DefaultDataFlowActivator` (wraps existing `Activator.CreateInstance()` behavior)
  - `ServiceProviderActivator` (resolves types via `IServiceProvider`, falling back to `ActivatorUtilities.CreateInstance` for unregistered types)
- New: `IServiceCollection` registration extensions for each ETLBox library:
  - `AddEtlBoxCore()`: registers all core sources, transformations, and destinations (open generics and non-generic shorthands)
  - `AddEtlBoxJson()`, `AddEtlBoxKafka()`, `AddEtlBoxRabbitMq()`, `AddEtlBoxRest()`, `AddEtlBoxScripting()`, `AddEtlBoxAI()`, `AddEtlBoxSerialization()`
- New: `ILogger<T>` constructor overloads added to all data flow steps (sources, transformations, destinations) across core and extension libraries. Base class hierarchy (`GenericTask` → `DataFlowTask` → intermediate bases) forwards `ILogger` via optional parameter chaining. Enables structured logging with proper log category resolution when components are resolved via DI.
🔧 Internal
- Removed `FluentAssertions` dependency from all test projects. All ~208 assertion calls across 9 files migrated to xUnit `Assert`, unifying on a single assertion style across the solution.
- Refactored `DataFlowActivator` static class into `DefaultDataFlowActivator` implementing `IDataFlowActivator`
- `DataFlowXmlReader` now accepts an optional `IDataFlowActivator` to control how types are instantiated during XML deserialization
- Added `Microsoft.Extensions.DependencyInjection.Abstractions` dependency to `ETLBox.Common` and `ETLBox.Serialization`
✨ Features
- Refactoring: Moved `DataFlowBatchDestination` from EtlBox.Classic to EtlBox.Classic.Common so that third-party developers can create batched transformations.
✨ Features
- Improvement: `DataFlowXmlReader` in the `ETLBox.Serialization` library can now deserialize `IDictionary<string,object>` values from `DataFlow` XML.
- Improvement: Changed the type of the `PromptParameters` setting in `AIBatchTransformation`. `PromptParameters` is now an `IDictionary<string,object>` of custom parameters that is passed directly to the Liquid-based prompt template when it is rendered.
✨ Features
- Improvement: `AIBatchTransformation` now supports a `PromptParameters` string setting that contains a JSON dictionary of custom parameters for the Liquid-based prompt template.
✨ Features
- New library: `ETLBox.AI` to apply AI features to `DataFlow`
- New transformation: `AIBatchTransformation` to post prompt data to an OpenAI API endpoint and get results
✨ Features
- Improvement: Added `BoundedCapacity` to `DataFlowBatchDestination` options to restrict buffer size and max memory consumption
🐛 Bug Fixes
- Fixed a memory leak when connection managers were not owned and not disposed.
- Fixed a bug in `ScriptedRowTransformation` where dependency injection was not working properly.
Other changes
- Moved back from versionize to scripted version bump in CI/CD pipeline
Other changes
- Version bump and release preparation
✨ Features
- Enhanced data flow process with connection manager pooling for better resource management
- Improved memory management and connection disposal
🐛 Bug Fixes
- Fixed vulnerabilities in dependencies (RSSL-10261)
- Added proper connection manager disposal to prevent memory leaks
Other changes
- Improved test debugging under .NET 8 SDK
- Updated documentation and TODO items
Other changes
- Build improvements and dependency updates
Other changes
- Added script to append GitLab changelog trailer to commits
- CI/CD pipeline improvements
🐛 Bug Fixes
- Removed duplicate `<Version>` tags from project files
Other changes
- Updated CI pipeline to handle version bump commits and renamed deploy job
Other changes
- Added version bump script and updated CI pipeline configuration
- Improved CI/CD automation
Other changes
- Updated CHANGELOG.md and documentation
Other changes
- Minor release with internal improvements
Other changes
- Minor release with internal improvements
Other changes
- Minor release with internal improvements
✨ Features
- New transformation: SqlRowTransformation, SqlCommandTransformation to run parametrised SQL queries/commands
- New transformation: KafkaTransformation producing to Kafka topics
- New transformation: RabbitMqTransformation publishing to RabbitMq queues
- Improvement: RestTransformation now returns the response body as a string and HTTP code
🐛 Bug Fixes
- DataFlowXmlReader: Fix to allow `<![CDATA[..]]>` in XML data
- DataFlowXmlReader: Added support for floating point properties
- DbRowTransformation: Fixed connection leak
- Updated dependencies with vulnerabilities
Other changes
- Migrated from manual versioning to versionize
- Update README.md
✨ Features
- Added cancellation support for long-running data flow processes
- New connection type: Added support for ClickHouse columnar store
- New source: Added Kafka topic support as a source
- New transformation: RestTransformation to post data to a REST endpoint and get results
- New transformation: JsonTransformation to evaluate JSONPath expressions and extract data from JSON
- New transformation: ScriptedRowTransformation to evaluate C# expressions to transform data
Other changes
- DbTransformation renamed to DbRowTransformation (DbTransformation is kept as `Obsolete`)
- NLog replaced with Microsoft.Extensions.Logging (except when logs are written to a DB table, where NLog is kept as the internal implementation)
✨ Features
- Added DataFlowXmlReader, allowing saving data flow graph configuration as XML