insert_or_ignore

class sqlalchemy_upsert_kit.sqlite.insert_or_ignore.InsertOrIgnoreExecutor(engine: sqlalchemy.engine.base.Engine, table: sqlalchemy.sql.schema.Table, values: list[dict[str, Any]], metadata: sqlalchemy.sql.schema.MetaData | None, temp_table_name: str | None, conn: sqlalchemy.engine.base.Connection | None, trans: sqlalchemy.engine.base.Transaction | None, columns: list[str] | None, _raise_on_temp_table_create: bool, _raise_on_temp_data_insert: bool, _raise_on_target_delete: bool, _raise_on_target_insert: bool, _raise_on_temp_table_drop: bool, _raise_on_merge_update: bool, _ignored_rows: int = 0, _replaced_rows: int = 0, _updated_rows: int = 0, _inserted_rows: int = 0, _temp_table_created: bool = False)[source]
apply_strategy(conn: Connection, trans: Transaction)[source]

Apply the upsert strategy-specific logic.

This abstract method must be implemented by subclasses to define their specific upsert behavior. The method is called after the temporary table has been created and populated with candidate data.

Implementation Requirements:

Subclasses should implement the core database operations that define their upsert strategy:

  • INSERT OR IGNORE: Use LEFT JOIN to insert only non-conflicting records

  • INSERT OR REPLACE: Delete conflicting records, then insert all records

  • UPSERT/MERGE: Update existing records, insert new ones

State Management:

Implementations should update the appropriate result counters:

  • self._ignored_rows - Records ignored (INSERT OR IGNORE)

  • self._replaced_rows - Records replaced (INSERT OR REPLACE)

  • self._updated_rows - Records updated (UPSERT/MERGE)

  • self._inserted_rows - New records inserted

Error Handling:

Implementations can use testing flags for controlled failure simulation:

  • self._raise_on_target_delete - Simulate deletion failures

  • self._raise_on_target_insert - Simulate insertion failures

Parameters:
  • conn – Database connection within active transaction

  • trans – Active transaction context

Raises:

Example Implementation:

def apply_strategy(self, conn, trans):
    # INSERT OR IGNORE strategy
    stmt = self.table.insert().from_select(
        list(self._temp_table.columns.keys()),
        sa.select(self._temp_table).select_from(
            self._temp_table.outerjoin(self.table, ...)
        ).where(self.table.c[self.pk_name].is_(None))
    )
    result = conn.execute(stmt)
    self._inserted_rows = result.rowcount or 0
    self._ignored_rows = len(self.values) - self._inserted_rows
sqlalchemy_upsert_kit.sqlite.insert_or_ignore.insert_or_ignore(engine: Engine, table: Table, values: list[dict[str, Any]], metadata: MetaData | None = None, temp_table_name: str | None = None, conn: Connection | None = None, trans: Transaction | None = None, _raise_on_temp_table_create: bool = False, _raise_on_temp_data_insert: bool = False, _raise_on_target_insert: bool = False, _raise_on_temp_table_drop: bool = False) tuple[int, int][source]

Perform high-performance bulk INSERT-IF-NOT-EXISTS operation using temporary table.

This function performs conditional bulk insertion: only inserts records whose primary keys don’t already exist in the target table. This is equivalent to “INSERT IGNORE” or “INSERT … ON CONFLICT DO NOTHING” but works more efficiently.

Algorithm:

  1. Creates temporary table and loads all candidate data

  2. Uses LEFT JOIN to identify records not in target table

  3. Bulk inserts only the non-conflicting records

  4. Cleans up temporary resources

This approach is ideal for:

  • Incremental data loading where duplicates should be ignored

  • ETL processes that need idempotent behavior

  • Syncing data from external sources

Transaction Management:

This function supports both auto-managed and user-managed transaction modes. See the module-level documentation for detailed explanations of each mode.

Parameters:
  • engine – SQLAlchemy engine for database connection

  • table – Target table for conditional insertion

  • values – Records to insert if they don’t exist. Must include primary key values for conflict detection.

  • metadata – Optional metadata instance for temporary table isolation. If None, a new MetaData instance is created for clean separation.

  • temp_table_name – Optional custom name for temporary table. If None, generates unique name with timestamp to avoid conflicts.

  • conn – Optional database connection for user-managed transaction mode. Must be provided together with trans parameter.

  • trans – Optional transaction for user-managed transaction mode. Must be provided together with conn parameter.

Returns:

Tuple of (ignored_rows, inserted_rows): - ignored_rows: Number of records that were not inserted (already existed) - inserted_rows: Number of new records successfully inserted

Raises:
  • ValueError – When conn and trans parameters are provided inconsistently (one is None while the other is not)

  • UpsertTestError – When testing flags are enabled and corresponding operations fail

Examples:

Auto-managed transaction (default mode):

# Function manages its own transaction
ignored, inserted = insert_or_ignore(engine, users_table, new_data)

User-managed transaction mode:

# Operation is part of larger transaction
with engine.connect() as conn:
    with conn.begin() as trans:
        # Other operations...
        ignored, inserted = insert_or_ignore(
            engine, users_table, new_data, conn=conn, trans=trans
        )
        # More operations...

Conflict detection example:

# Target table has records with id=1,2,3
new_data = [
    {'id': 2, 'name': 'Bob'},      # Exists - will be ignored
    {'id': 4, 'name': 'Charlie'},  # New - will be inserted
    {'id': 5, 'name': 'David'},    # New - will be inserted
]
ignored, inserted = insert_or_ignore(engine, users_table, new_data)
# Result: ignored=1, inserted=2

Note

Parameters prefixed with _raise_on_ are exclusively for testing error handling and cleanup behavior. Never use these in production code.