Blog - Popsink vs Fivetran HVR for IBM i (AS/400) replication

Introduction: Why the Capture Method Matters for Both IT and Business

IBM i (formerly known as AS/400) continues to run core processes in key industries such as finance, insurance, logistics, and manufacturing, where reliability is non-negotiable. The operational data these systems generate is most valuable when it reaches analytics platforms, AI pipelines, and operational dashboards in near real time. The method used to capture and replicate that data directly affects latency, system load, scalability, and the ability to act on information when it matters most.

The gap between a low-latency, high-fidelity stream and a slower, CPU-heavy process can decide whether:

- A fraud detection system blocks a suspicious transaction within seconds or only after it’s too late.

- Planning dashboards show current inventory or outdated figures.

- The production system stays responsive under heavy change volumes or suffers degraded performance from CDC overhead.

For IT leaders, the replication method directly impacts system reliability, scalability, and operational risk. For business leaders, it affects speed to insight, responsiveness to events, and competitive advantage. This post compares two approaches for IBM i Change Data Capture (CDC):

- Fivetran HVR: ODBC + SQL for both snapshots and change capture, with changes read through the QSYS2.DISPLAY_JOURNAL user-defined table function

- Popsink: JDBC + SQL for snapshots, and raw binary streaming for change capture

The focus is on output quality, throughput, CPU footprint, and reliability at scale: the factors that matter most when CDC is part of your operational core.

Architecture & Methodology

Capability	Popsink (Journal RPC API)	Fivetran HVR (ODBC + SQL)
Data source	Native IBM i journal binaries (attached to physical files)	User-Defined Table Function for journal query
Access method	RPC - direct access via JT400	ODBC - SQL queries over Db2 for i
Data fidelity	Full binary journal payload (with some filtering)	Certain fields may be returned as `POINTER` values
Offset tracking	Exact receiver offsets for continuous streaming	Row-based polling via SQL result sets
Resource use pattern	Low SQL engine involvement; lightweight parsing	SQL engine materializes rows; added parsing overhead
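
To make the "row-based polling" column concrete, here is a minimal Python sketch of the kind of query an SQL-based reader might issue on each poll. It is not HVR's actual query. The helper names (`JournalRef`, `build_poll_query`), the batch size, and the choice of columns are assumptions for illustration; the parameter and column names are taken from IBM's documentation for QSYS2.DISPLAY_JOURNAL as understood here and should be checked against your IBM i release.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JournalRef:
    """Identifies a journal by library and name (hypothetical helper type)."""
    library: str
    name: str


def build_poll_query(journal: JournalRef, batch_size: int = 1000) -> str:
    """Build a polling query against QSYS2.DISPLAY_JOURNAL.

    The single `?` marker is the journal sequence number to resume from.
    """
    for part in (journal.library, journal.name):
        # Object names are inlined as literals, so reject anything unusual.
        if not part.replace("_", "").isalnum():
            raise ValueError(f"unexpected characters in object name: {part!r}")
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return (
        "SELECT SEQUENCE_NUMBER, JOURNAL_CODE, JOURNAL_ENTRY_TYPE, ENTRY_DATA "
        "FROM TABLE(QSYS2.DISPLAY_JOURNAL("
        f"JOURNAL_LIBRARY => '{journal.library}', "
        f"JOURNAL_NAME => '{journal.name}', "
        "STARTING_SEQUENCE => CAST(? AS DECIMAL(21, 0)))) AS J "
        "ORDER BY SEQUENCE_NUMBER "
        f"FETCH FIRST {batch_size} ROWS ONLY"
    )


if __name__ == "__main__":
    # APPLIB/APPJRN are placeholder names, not values from the comparison above.
    print(build_poll_query(JournalRef("APPLIB", "APPJRN"), batch_size=500))
```

Each poll of this shape asks the SQL engine to materialize journal entries as rows before the client can parse them, which is the overhead the table attributes to the ODBC + SQL approach.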

Performance & Throughput

Order-of-Magnitude Example: 100,000 journaled row changes per minute

Metric	Popsink (RPC API)	Fivetran HVR (ODBC + SQL)
End-to-end latency (from commit to publish)	50–200 ms	1–5 s typical; spikes higher under load
Sustained throughput (example: mid-range Power9)	>50k changes/sec	5–10k changes/sec before lag builds up
Network payload size	Compact binary format	Expanded SQL row format with metadata
Backpressure behavior	Continuous streaming; stable under burst load	May queue inside SQL engine; lag increases with bursts
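
The relationship between the per-minute example and the per-second throughput figures can be worked through with a little arithmetic. The sketch below is a simplified constant-rate model, not a benchmark: the function names are hypothetical, and the 20,000 changes/sec burst lasting 30 seconds is an invented scenario used only to show how lag builds once arrivals exceed sustained capacity.

```python
def per_second(changes_per_minute: float) -> float:
    """Convert a per-minute change rate to changes per second."""
    return changes_per_minute / 60.0


def backlog_after(arrival_per_sec: float, capacity_per_sec: float, seconds: float) -> float:
    """Entries still waiting after `seconds` of a constant-rate burst.

    Simplified model: backlog grows only while arrivals exceed capacity.
    """
    return max(0.0, (arrival_per_sec - capacity_per_sec) * seconds)


def drain_time(backlog: float, arrival_per_sec: float, capacity_per_sec: float) -> float:
    """Seconds needed to clear `backlog` once arrivals fall to `arrival_per_sec`."""
    spare = capacity_per_sec - arrival_per_sec
    if spare <= 0:
        return float("inf")  # no spare capacity, so the backlog never drains
    return backlog / spare


if __name__ == "__main__":
    baseline = per_second(100_000)  # the example load, about 1,667 changes/sec
    # Hypothetical burst: 20,000 changes/sec for 30 seconds.
    for capacity in (10_000, 50_000):  # upper bounds taken from the table above
        backlog = backlog_after(20_000, capacity, 30)
        wait = drain_time(backlog, baseline, capacity)
        print(f"capacity={capacity}/s backlog={backlog:.0f} drain={wait:.1f}s")
```

Under these assumptions the steady 100,000 changes per minute sits well below either ceiling; it is the burst that separates the two, leaving 300,000 entries queued at a 10k/sec capacity (about 36 seconds to drain) and none at 50k/sec.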

Why RPC Wins:
By reading journal receivers directly, Popsink avoids the SQL engine’s row materialization step. This cuts CPU cycles and keeps the stream closer to real time, even as volume spikes.

CPU Impact on the IBM i

Load Profile	Popsink (RPC API)	Fivetran HVR (ODBC + SQL)
Low Volume (<10k changes/min)	<2% CPU on typical partition	~5% CPU (due to SQL processing overhead)
Moderate Volume (~50k changes/min)	~5% CPU	15–20% CPU
High Volume (100k+ changes/min)	10–15% CPU	30%+ CPU; potential impact on production workloads

Source: Estimates based on IBM’s documented overhead for DISPLAY_JOURNAL vs QjoRetrieveJournalEntries, and on high-volume results reported in user tests and industry case studies.

Reliability at Scale

Factor	Popsink (RPC API)	Fivetran HVR (ODBC + SQL)
Receiver gaps	Tracks exact offsets in journal receivers; no silent skips	Polling risk if lag occurs or receiver is deleted mid-query
Fidelity under load	Full journal payload preserved	Some fields may return as `POINTER`; requires extra queries
Large transactions	Handled as a continuous multi-entry stream	SQL may buffer large result sets, increasing lag
Operational risk	Minimal impact on Db2 query engine	Competes with production queries for SQL engine resources
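
The "no silent skips" property in the receiver-gaps row comes down to checking every entry against a stored position. The following Python sketch shows one way such a check could look. It is not Popsink's implementation: `Checkpoint` and `SequenceGapError` are hypothetical names, and the sketch assumes an unfiltered stream whose sequence numbers increase by exactly one and are not reset when the journal switches receivers.

```python
from dataclasses import dataclass


class SequenceGapError(RuntimeError):
    """Raised when entries are missing between two observed journal entries."""


@dataclass
class Checkpoint:
    receiver: str   # journal receiver that held the last processed entry
    sequence: int   # sequence number of the last processed entry

    def advance(self, receiver: str, sequence: int) -> bool:
        """Record one processed entry.

        Returns False for an entry that was already processed (for example
        a replay after a restart) and raises if entries were skipped.
        """
        if sequence <= self.sequence:
            return False
        if sequence != self.sequence + 1:
            raise SequenceGapError(
                f"expected sequence {self.sequence + 1}, got {sequence} in {receiver}"
            )
        self.receiver, self.sequence = receiver, sequence
        return True


if __name__ == "__main__":
    # Receiver names and sequence numbers are made-up sample data.
    checkpoint = Checkpoint(receiver="RCV0001", sequence=100)
    for rcv, seq in [("RCV0001", 101), ("RCV0001", 101), ("RCV0002", 102)]:
        print(rcv, seq, "accepted" if checkpoint.advance(rcv, seq) else "duplicate")
```

Calling `advance` only after an entry has been published downstream keeps the stored position from running ahead of what was actually delivered, so a restart replays entries instead of skipping them.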

Permissions & Security

Change Data Capture on IBM i requires explicit authority to journals and their receivers.

For Popsink, the user profile needs:

- USE and OBJEXIST authority on the journal object.

- EXECUTE authority on the journal library.

- USE authority on journal receivers.

- No SQL-level SELECT authority on system UDTFs is required, since Popsink calls QjoRetrieveJournalEntries directly.
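
As a rough illustration, the Popsink authorities listed above could be granted with CL `GRTOBJAUT` commands along the following lines. This Python sketch only generates the command text. The function name, user profile, and library names are hypothetical, and covering every receiver in a library with `*ALL` is an assumption about how an administrator might scope the grant, not Popsink's documented setup.

```python
from typing import List


def grant_commands(user: str, journal_lib: str, journal: str, receiver_lib: str) -> List[str]:
    """Return CL command strings granting the journal-reading authorities."""

    def grant(obj: str, objtype: str, authority: str) -> str:
        return f"GRTOBJAUT OBJ({obj}) OBJTYPE({objtype}) USER({user}) AUT({authority})"

    return [
        # *USE and *OBJEXIST are granted separately: *USE is a single-value
        # authority and cannot be combined with others in one AUT() list.
        grant(f"{journal_lib}/{journal}", "*JRN", "*USE"),
        grant(f"{journal_lib}/{journal}", "*JRN", "*OBJEXIST"),
        # Authority on the library that contains the journal.
        grant(journal_lib, "*LIB", "*EXECUTE"),
        # Assumption: all existing receivers in the library are in scope.
        grant(f"{receiver_lib}/*ALL", "*JRNRCV", "*USE"),
    ]


if __name__ == "__main__":
    # CDCUSER, APPLIB and APPJRN are placeholder names.
    for command in grant_commands("CDCUSER", "APPLIB", "APPJRN", "APPLIB"):
        print(command)
```

A grant over `*ALL` applies to receivers that exist when the command runs, so receivers created later would need their own arrangement, such as an authorization list.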

For Fivetran HVR, the user profile needs:

- ODBC access via DRDA.

- SELECT privileges on QSYS2.DISPLAY_JOURNAL.

- Inherited authority to the underlying journal and receivers.

- Additional ODBC permissions for metadata queries.

In simpler terms: Popsink gets only the keys it needs to read the journal, while Fivetran HVR requires a master key that grants access to the UDTF and automatically inherits the permissions the function itself has been granted.

Strategic Considerations

Business Continuity

High CPU impact on IBM i can jeopardize transaction processing. Popsink’s lighter footprint leaves more headroom for business-critical workloads.

Real-Time Decisioning

Whether for fraud prevention, just-in-time supply chain, or live analytics, a 2–5 second lag can mean lost opportunities or risk exposure. Popsink’s sub-second potential is a strategic enabler.

Scalability

SQL-based CDC often scales non-linearly: doubling change volume can more than double CPU usage. RPC-based capture scales linearly and predictably.

Audit & Compliance

Regulatory scenarios (e.g., SOX, PSD3) require exact replication of source changes. Popsink preserves all journal data, eliminating ambiguity.

Security Implications

With Popsink, privileges are limited to the exact journals being read, reducing the attack surface compared with granting broad ODBC/SQL access.

Conclusion

Both Popsink and HVR can replicate from IBM i to modern platforms, but how they capture changes has decisive effects on performance, scalability, and business outcomes.

- Popsink’s journal RPC method offers higher throughput, lower CPU impact, and full fidelity, making it the clear choice for high-volume, latency-sensitive, and business-critical workloads.

- HVR’s SQL-based method is more CPU-intensive and slightly less complete, but it can be viable for workloads where CPU headroom and replication latency are not pressing concerns.

In environments where IBM i remains mission-critical and operational risk is tightly managed, the replication method is not just a technical detail - it’s a strategic choice that determines whether CDC is an enabler or a bottleneck.
