
Webhook Retry Logic

Learn how Omise handles webhook retries, how to implement idempotency for duplicate deliveries, and how to build reliable webhook processing systems.

Overview

Webhook delivery is not guaranteed on the first attempt. Network issues, server downtime, or processing errors can cause failures. This guide covers:

  • Omise's automatic retry schedule and exponential backoff
  • Implementing idempotency to handle duplicate events
  • Logging and monitoring webhook deliveries
  • Manual retry procedures
  • Best practices for handling failures

Retry Schedule

How Omise Retries Webhooks

When a webhook delivery fails (non-2xx response or timeout), Omise automatically retries with exponential backoff:

Attempt 1: Immediate (initial delivery)
Attempt 2: After 1 minute
Attempt 3: After 5 minutes
Attempt 4: After 15 minutes
Attempt 5: After 1 hour
Attempt 6: After 3 hours
Attempt 7: After 6 hours
Attempt 8: After 12 hours
Attempt 9: After 24 hours
Attempt 10: After 48 hours
Final attempts: Every 48 hours until 7 days elapsed

Retry Behavior

Triggers for retry:

  • Any non-2xx HTTP status code (including 4xx client errors and 5xx server errors)
  • Network timeouts (no response within 10 seconds)
  • Connection failures (DNS errors, connection refused)

Success criteria:

  • HTTP status code 200-299
  • Response received within 10 seconds

Retry period:

  • Retries continue for up to 7 days from initial attempt
  • After 7 days, webhook is marked as permanently failed

Exponential Backoff Explained

// Visualization of retry timing
const retrySchedule = [
  { attempt: 1, delay: 0, cumulative: '0 minutes' },
  { attempt: 2, delay: 60, cumulative: '1 minute' },
  { attempt: 3, delay: 300, cumulative: '6 minutes' },
  { attempt: 4, delay: 900, cumulative: '21 minutes' },
  { attempt: 5, delay: 3600, cumulative: '81 minutes' },
  { attempt: 6, delay: 10800, cumulative: '261 minutes' },
  { attempt: 7, delay: 21600, cumulative: '621 minutes' },
  { attempt: 8, delay: 43200, cumulative: '1341 minutes' },
  { attempt: 9, delay: 86400, cumulative: '2781 minutes' },
  { attempt: 10, delay: 172800, cumulative: '5661 minutes' }
];

console.table(retrySchedule);

Benefits of exponential backoff:

  • Reduces load on failing servers
  • Gives time for issues to be resolved
  • Prevents thundering herd problems
  • Maximizes chance of eventual delivery
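The same idea applies to any retry logic you add on your own side (for example, internal processing retries). A minimal sketch of capped exponential backoff with full jitter — the base delay and cap here are illustrative values, not Omise's schedule:

```javascript
// Capped exponential backoff with full jitter for internal retries.
// baseMs and maxMs are illustrative defaults, not Omise's schedule.
function backoffDelay(attempt, baseMs = 1000, maxMs = 60000) {
  const ceiling = Math.min(maxMs, baseMs * 2 ** (attempt - 1)); // 1s, 2s, 4s, ...
  return Math.floor(Math.random() * ceiling); // full jitter spreads retries out
}
```

Full jitter (picking a random delay up to the exponential ceiling) is what prevents the thundering-herd problem: clients that failed at the same moment do not all retry at the same moment.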

Idempotency

Idempotency ensures processing a webhook multiple times has the same effect as processing it once. This is critical because webhooks can be delivered multiple times due to retries.

Why Idempotency Matters

Scenario: Webhook delivery fails after processing but before responding

1. Webhook received
2. Charge marked as paid in database ✓
3. Email sent to customer ✓
4. Server crashes before responding ✗
5. Omise retries webhook
6. Without idempotency:
   - Charge marked as paid again (duplicate)
   - Customer receives duplicate email ✗
7. With idempotency:
   - Recognize event already processed
   - Skip duplicate processing ✓

Idempotency Patterns

1. Event ID Tracking

Track processed event IDs to prevent duplicate processing:

// Node.js - Event ID tracking with Redis (node-redis v4)
const redis = require('redis');
const client = redis.createClient();
client.connect(); // v4 clients must connect before use

async function processWebhookIdempotent(event) {
  const eventId = event.id;
  const lockKey = `webhook:processing:${eventId}`;
  const processedKey = `webhook:processed:${eventId}`;

  // Check if already processed
  const isProcessed = await client.exists(processedKey);
  if (isProcessed) {
    console.log(`Event ${eventId} already processed, skipping`);
    return { status: 'duplicate', eventId };
  }

  // Acquire lock to prevent concurrent processing
  const lockAcquired = await client.set(lockKey, '1', {
    EX: 60,  // Expire after 60 seconds
    NX: true // Only set if it doesn't exist
  });

  if (!lockAcquired) {
    console.log(`Event ${eventId} is being processed, waiting`);
    return { status: 'processing', eventId };
  }

  try {
    // Process the webhook
    await processWebhook(event);

    // Mark as processed (keep for 7 days)
    await client.setEx(processedKey, 7 * 24 * 3600, JSON.stringify({
      processed_at: new Date().toISOString(),
      event_key: event.key
    }));

    return { status: 'success', eventId };
  } catch (error) {
    console.error(`Error processing event ${eventId}:`, error);
    throw error;
  } finally {
    // Release lock
    await client.del(lockKey);
  }
}

// Express endpoint
app.post('/webhooks/omise', async (req, res) => {
  // Verify signature first
  if (!verifySignature(req.body, req.headers['x-omise-signature'])) {
    return res.status(401).json({ error: 'Invalid signature' });
  }

  const event = JSON.parse(req.body.toString());

  // Respond immediately
  res.status(200).json({ received: true });

  // Process idempotently
  try {
    const result = await processWebhookIdempotent(event);
    console.log('Processing result:', result);
  } catch (error) {
    console.error('Webhook processing failed:', error);
  }
});

# Python - Event ID tracking with Redis
import json
from datetime import datetime

import redis
from flask import Flask, request, jsonify

app = Flask(__name__)
redis_client = redis.Redis(host='localhost', port=6379, decode_responses=True)

def process_webhook_idempotent(event):
    event_id = event['id']
    lock_key = f'webhook:processing:{event_id}'
    processed_key = f'webhook:processed:{event_id}'

    # Check if already processed
    if redis_client.exists(processed_key):
        print(f'Event {event_id} already processed, skipping')
        return {'status': 'duplicate', 'event_id': event_id}

    # Acquire lock
    lock_acquired = redis_client.set(lock_key, '1', ex=60, nx=True)
    if not lock_acquired:
        print(f'Event {event_id} is being processed, waiting')
        return {'status': 'processing', 'event_id': event_id}

    try:
        # Process the webhook
        process_webhook(event)

        # Mark as processed (keep for 7 days)
        processed_data = json.dumps({
            'processed_at': datetime.now().isoformat(),
            'event_key': event['key']
        })
        redis_client.setex(processed_key, 7 * 24 * 3600, processed_data)

        return {'status': 'success', 'event_id': event_id}
    except Exception as e:
        print(f'Error processing event {event_id}: {e}')
        raise
    finally:
        # Release lock
        redis_client.delete(lock_key)

@app.route('/webhooks/omise', methods=['POST'])
def handle_webhook():
    # Verify signature
    payload = request.get_data()
    signature = request.headers.get('X-Omise-Signature')
    if not verify_signature(payload, signature):
        return jsonify({'error': 'Invalid signature'}), 401

    event = request.get_json()

    # Process idempotently (synchronous here; hand off to a
    # background worker in production so the response stays fast)
    try:
        result = process_webhook_idempotent(event)
        print(f'Processing result: {result}')
    except Exception as e:
        print(f'Webhook processing failed: {e}')

    return jsonify({'received': True}), 200

2. Database Constraints

Use database unique constraints to enforce idempotency:

// Node.js - Database-level idempotency with MongoDB
const mongoose = require('mongoose');

// Webhook event schema with unique constraint
const WebhookEventSchema = new mongoose.Schema({
  event_id: {
    type: String,
    required: true,
    unique: true, // Prevents duplicate processing
    index: true
  },
  event_key: { type: String, required: true },
  processed_at: { type: Date, default: Date.now },
  data: { type: mongoose.Schema.Types.Mixed },
  error: { type: String }, // populated when processing fails
  status: {
    type: String,
    enum: ['processing', 'completed', 'failed'],
    default: 'processing'
  }
});

const WebhookEvent = mongoose.model('WebhookEvent', WebhookEventSchema);

async function processWebhookIdempotent(event) {
  try {
    // Try to insert event record
    const webhookEvent = new WebhookEvent({
      event_id: event.id,
      event_key: event.key,
      data: event.data
    });

    await webhookEvent.save();

  } catch (error) {
    // Duplicate key error - event already processed
    if (error.code === 11000) {
      console.log(`Event ${event.id} already processed`);
      return { status: 'duplicate' };
    }
    throw error;
  }

  try {
    // Process the webhook
    await processWebhook(event);

    // Update status to completed
    await WebhookEvent.updateOne(
      { event_id: event.id },
      { status: 'completed' }
    );

    return { status: 'success' };
  } catch (error) {
    // Update status to failed
    await WebhookEvent.updateOne(
      { event_id: event.id },
      { status: 'failed', error: error.message }
    );
    throw error;
  }
}

# Ruby/Rails - Database-level idempotency
class WebhookEvent < ApplicationRecord
  validates :event_id, presence: true, uniqueness: true

  enum status: { processing: 0, completed: 1, failed: 2 }
end

# Migration
class CreateWebhookEvents < ActiveRecord::Migration[6.1]
  def change
    create_table :webhook_events do |t|
      t.string :event_id, null: false, index: { unique: true }
      t.string :event_key, null: false
      t.integer :status, default: 0
      t.json :data
      t.text :error_message
      t.timestamps
    end
  end
end

# Webhook processor
class WebhookProcessor
  def self.process_idempotent(event)
    webhook_event = WebhookEvent.create!(
      event_id: event['id'],
      event_key: event['key'],
      data: event['data']
    )

    # Process webhook
    process_webhook(event)

    webhook_event.update!(status: :completed)
    { status: 'success' }

  # Uniqueness violation: raised by the model validation (RecordInvalid)
  # or by the database index (RecordNotUnique)
  rescue ActiveRecord::RecordNotUnique, ActiveRecord::RecordInvalid
    Rails.logger.info "Event #{event['id']} already processed"
    { status: 'duplicate' }
  rescue => e
    webhook_event&.update!(status: :failed, error_message: e.message)
    raise
  end
end

3. Idempotency Keys in Business Logic

Make operations naturally idempotent:

// Node.js - Idempotent order fulfillment
async function fulfillOrder(charge) {
  const orderId = charge.metadata.order_id;

  // Use database transaction for atomicity
  const session = await mongoose.startSession();
  session.startTransaction();

  try {
    const order = await Order.findById(orderId).session(session);

    if (!order) {
      throw new Error(`Order ${orderId} not found`);
    }

    // Check current status - idempotent check
    if (order.status === 'fulfilled') {
      console.log(`Order ${orderId} already fulfilled`);
      await session.commitTransaction();
      return { status: 'already_fulfilled', order };
    }

    if (order.status !== 'paid') {
      console.log(`Order ${orderId} not in paid status`);
      await session.commitTransaction();
      return { status: 'invalid_status', order };
    }

    // Fulfill order
    order.status = 'fulfilled';
    order.fulfilled_at = new Date();
    await order.save({ session });

    // Reduce inventory
    for (const item of order.items) {
      await Product.updateOne(
        { _id: item.product_id },
        { $inc: { stock: -item.quantity } }
      ).session(session);
    }

    // Create shipment
    const shipment = new Shipment({
      order_id: orderId,
      status: 'pending',
      items: order.items
    });
    await shipment.save({ session });

    await session.commitTransaction();

    // Send email (outside transaction, handle separately)
    await sendFulfillmentEmail(order);

    return { status: 'success', order };

  } catch (error) {
    await session.abortTransaction();
    throw error;
  } finally {
    session.endSession();
  }
}

# Python - Idempotent subscription extension
from sqlalchemy.orm import Session
from datetime import datetime, timedelta

def extend_subscription(db: Session, charge):
    customer_id = charge['customer']

    # Get subscription with row lock
    subscription = db.query(Subscription).filter(
        Subscription.customer_id == customer_id,
        Subscription.status == 'active'
    ).with_for_update().first()

    if not subscription:
        print(f'No active subscription for customer {customer_id}')
        return {'status': 'no_subscription'}

    # Check if this charge already processed
    payment_exists = db.query(PaymentHistory).filter(
        PaymentHistory.charge_id == charge['id']
    ).first()

    if payment_exists:
        print(f"Charge {charge['id']} already processed")
        return {'status': 'duplicate', 'subscription': subscription}

    try:
        # Extend subscription
        subscription.next_billing_date = subscription.next_billing_date + timedelta(days=30)
        subscription.last_payment_date = datetime.now()
        subscription.failed_attempts = 0

        # Record payment
        payment = PaymentHistory(
            subscription_id=subscription.id,
            charge_id=charge['id'],
            amount=charge['amount'],
            paid_at=datetime.now()
        )
        db.add(payment)

        db.commit()

        # Send confirmation email
        send_renewal_confirmation(subscription)

        return {'status': 'success', 'subscription': subscription}

    except Exception:
        db.rollback()
        raise

// Go - Idempotent transfer processing
func processTransferIdempotent(db *sql.DB, transfer Transfer) error {
	tx, err := db.Begin()
	if err != nil {
		return fmt.Errorf("failed to begin transaction: %w", err)
	}
	defer tx.Rollback()

	// Check if transfer already processed
	var count int
	err = tx.QueryRow(
		"SELECT COUNT(*) FROM processed_transfers WHERE transfer_id = ?",
		transfer.ID,
	).Scan(&count)

	if err != nil {
		return fmt.Errorf("failed to check processed transfers: %w", err)
	}

	if count > 0 {
		log.Printf("Transfer %s already processed", transfer.ID)
		return nil // Already processed, return success
	}

	// Process transfer
	_, err = tx.Exec(
		"UPDATE accounts SET balance = balance + ? WHERE id = ?",
		transfer.Amount,
		transfer.RecipientID,
	)
	if err != nil {
		return fmt.Errorf("failed to update balance: %w", err)
	}

	// Mark as processed
	_, err = tx.Exec(
		"INSERT INTO processed_transfers (transfer_id, processed_at) VALUES (?, ?)",
		transfer.ID,
		time.Now(),
	)
	if err != nil {
		return fmt.Errorf("failed to mark as processed: %w", err)
	}

	if err := tx.Commit(); err != nil {
		return fmt.Errorf("failed to commit transaction: %w", err)
	}

	// Send notification (outside transaction)
	go sendTransferNotification(transfer)

	return nil
}

Handling Failures

Graceful Error Handling

Always respond to webhooks quickly, even when processing fails:

// Node.js - Graceful error handling
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

app.post('/webhooks/omise', async (req, res) => {
  const startTime = Date.now();
  let event;

  try {
    // Verify signature
    const signature = req.headers['x-omise-signature'];
    if (!signature || !verifySignature(req.body, signature)) {
      logger.error('Invalid signature', {
        ip: req.ip,
        signature: signature
      });
      return res.status(401).json({ error: 'Invalid signature' });
    }

    event = JSON.parse(req.body.toString());

    // Respond immediately (before processing)
    res.status(200).json({ received: true });

    // Log receipt
    logger.info('Webhook received', {
      event_id: event.id,
      event_key: event.key,
      received_at: new Date().toISOString()
    });

  } catch (error) {
    logger.error('Webhook parsing error', {
      error: error.message,
      stack: error.stack
    });
    return res.status(400).json({ error: 'Invalid request' });
  }

  // Process asynchronously with error handling
  processWebhookWithRetry(event)
    .then(() => {
      logger.info('Webhook processed successfully', {
        event_id: event.id,
        duration: Date.now() - startTime
      });
    })
    .catch(error => {
      logger.error('Webhook processing failed', {
        event_id: event.id,
        event_key: event.key,
        error: error.message,
        stack: error.stack,
        duration: Date.now() - startTime
      });

      // Queue for manual review
      queueForManualReview(event, error);
    });
});

async function processWebhookWithRetry(event, attempt = 1, maxAttempts = 3) {
  try {
    await processWebhook(event);
  } catch (error) {
    if (attempt < maxAttempts) {
      logger.warn(`Retry attempt ${attempt} for event ${event.id}`);
      await sleep(Math.pow(2, attempt) * 1000); // Exponential backoff
      return processWebhookWithRetry(event, attempt + 1, maxAttempts);
    }
    throw error; // Max retries exceeded
  }
}

Dead Letter Queue

Store failed webhooks for later processing:

# Python - Dead letter queue with Redis
import redis
import json
from datetime import datetime

redis_client = redis.Redis(host='localhost', port=6379, decode_responses=True)

def queue_for_manual_review(event, error):
    """Add failed webhook to dead letter queue"""
    failed_webhook = {
        'event': event,
        'error': str(error),
        'failed_at': datetime.now().isoformat(),
        'attempts': get_attempt_count(event['id'])
    }

    # Add to dead letter queue (sorted set by timestamp)
    redis_client.zadd(
        'webhook:dead_letter_queue',
        {json.dumps(failed_webhook): datetime.now().timestamp()}
    )

    # Send alert to operations team
    send_alert(f"Webhook {event['id']} failed after retries", failed_webhook)

def get_failed_webhooks(limit=100):
    """Retrieve failed webhooks for manual review"""
    items = redis_client.zrange(
        'webhook:dead_letter_queue',
        0,
        limit - 1,
        withscores=True
    )

    failed_webhooks = []
    for item, score in items:
        webhook = json.loads(item)
        webhook['failed_timestamp'] = score
        failed_webhooks.append(webhook)

    return failed_webhooks

def retry_failed_webhook(event_id):
    """Manually retry a failed webhook"""
    items = redis_client.zrange('webhook:dead_letter_queue', 0, -1)

    for item in items:
        webhook = json.loads(item)
        if webhook['event']['id'] == event_id:
            try:
                # Retry processing
                process_webhook(webhook['event'])

                # Remove from dead letter queue
                redis_client.zrem('webhook:dead_letter_queue', item)

                print(f"Successfully reprocessed webhook {event_id}")
                return True
            except Exception as e:
                print(f"Retry failed: {e}")
                return False

    print(f"Webhook {event_id} not found in dead letter queue")
    return False

# Ruby - Dead letter queue with Sidekiq
class WebhookDeadLetterJob
  include Sidekiq::Job

  sidekiq_options queue: 'dead_letter', retry: false

  def perform(event, error_message, attempts)
    # Store in database for review
    FailedWebhook.create!(
      event_id: event['id'],
      event_key: event['key'],
      event_data: event,
      error_message: error_message,
      attempts: attempts,
      failed_at: Time.current
    )

    # Send alert
    WebhookFailureMailer.notify_operations(event, error_message).deliver_now

    # Create incident if critical
    if critical_event?(event['key'])
      PagerDuty.trigger_incident(
        title: "Critical webhook failed: #{event['key']}",
        details: {
          event_id: event['id'],
          error: error_message,
          attempts: attempts
        }
      )
    end
  end

  private

  def critical_event?(event_key)
    ['charge.complete', 'refund.create', 'dispute.create'].include?(event_key)
  end
end

# Manual retry command
class RetryFailedWebhook
  def self.call(failed_webhook_id)
    failed_webhook = FailedWebhook.find(failed_webhook_id)

    begin
      # Retry processing
      WebhookProcessor.process(failed_webhook.event_data)

      # Mark as resolved
      failed_webhook.update!(
        resolved: true,
        resolved_at: Time.current
      )

      Rails.logger.info "Successfully reprocessed webhook #{failed_webhook.event_id}"
      true
    rescue => e
      Rails.logger.error "Retry failed: #{e.message}"
      false
    end
  end
end

Logging and Monitoring

Comprehensive Webhook Logging

// Node.js - Structured logging with Winston
const winston = require('winston');

const logger = winston.createLogger({
  level: 'info',
  format: winston.format.combine(
    winston.format.timestamp(),
    winston.format.json()
  ),
  transports: [
    new winston.transports.File({
      filename: 'webhooks-error.log',
      level: 'error'
    }),
    new winston.transports.File({
      filename: 'webhooks-combined.log'
    })
  ]
});

// Log webhook receipt
logger.info('webhook.received', {
  event_id: event.id,
  event_key: event.key,
  livemode: event.livemode,
  ip: req.ip,
  received_at: new Date().toISOString()
});

// Log signature verification
logger.info('webhook.signature_verified', {
  event_id: event.id,
  signature_present: !!signature
});

// Log idempotency check
logger.info('webhook.idempotency_check', {
  event_id: event.id,
  is_duplicate: isDuplicate,
  cached: isProcessed
});

// Log processing start
logger.info('webhook.processing_started', {
  event_id: event.id,
  event_key: event.key,
  started_at: new Date().toISOString()
});

// Log processing completion
logger.info('webhook.processing_completed', {
  event_id: event.id,
  duration_ms: Date.now() - startTime,
  completed_at: new Date().toISOString()
});

// Log errors with context
logger.error('webhook.processing_failed', {
  event_id: event.id,
  event_key: event.key,
  error: error.message,
  stack: error.stack,
  attempt: attemptCount,
  duration_ms: Date.now() - startTime
});

Monitoring Metrics

Track key webhook metrics:

// Node.js - Metrics with Prometheus
const prometheus = require('prom-client');

// Define metrics
const webhookReceived = new prometheus.Counter({
  name: 'webhook_received_total',
  help: 'Total number of webhooks received',
  labelNames: ['event_key', 'livemode']
});

const webhookProcessingDuration = new prometheus.Histogram({
  name: 'webhook_processing_duration_seconds',
  help: 'Webhook processing duration',
  labelNames: ['event_key', 'status'],
  buckets: [0.1, 0.5, 1, 2, 5, 10]
});

const webhookErrors = new prometheus.Counter({
  name: 'webhook_errors_total',
  help: 'Total number of webhook processing errors',
  labelNames: ['event_key', 'error_type']
});

const webhookDuplicates = new prometheus.Counter({
  name: 'webhook_duplicates_total',
  help: 'Total number of duplicate webhooks',
  labelNames: ['event_key']
});

// Record metrics
app.post('/webhooks/omise', async (req, res) => {
  const startTime = Date.now();
  const event = JSON.parse(req.body.toString());

  // Record receipt
  webhookReceived.inc({
    event_key: event.key,
    livemode: event.livemode
  });

  res.status(200).json({ received: true });

  try {
    // Check for duplicate
    if (await isEventProcessed(event.id)) {
      webhookDuplicates.inc({ event_key: event.key });
      return;
    }

    // Process webhook
    await processWebhook(event);

    // Record success
    webhookProcessingDuration.observe(
      { event_key: event.key, status: 'success' },
      (Date.now() - startTime) / 1000
    );

  } catch (error) {
    // Record error
    webhookErrors.inc({
      event_key: event.key,
      error_type: error.constructor.name
    });

    webhookProcessingDuration.observe(
      { event_key: event.key, status: 'error' },
      (Date.now() - startTime) / 1000
    );
  }
});

// Expose metrics endpoint
app.get('/metrics', async (req, res) => {
  res.set('Content-Type', prometheus.register.contentType);
  // metrics() returns a Promise in prom-client v13+
  res.end(await prometheus.register.metrics());
});

Alerting Rules

Set up alerts for webhook issues:

# Prometheus alerting rules
groups:
  - name: webhook_alerts
    interval: 30s
    rules:
      # Alert on high error rate
      - alert: HighWebhookErrorRate
        expr: |
          rate(webhook_errors_total[5m]) > 0.1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High webhook error rate detected"
          description: "Webhook error rate is {{ $value }} errors/sec"

      # Alert on processing delays
      - alert: SlowWebhookProcessing
        expr: |
          histogram_quantile(0.95, rate(webhook_processing_duration_seconds_bucket[5m])) > 5
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Slow webhook processing detected"
          description: "95th percentile processing time is {{ $value }}s"

      # Alert on duplicate webhooks
      - alert: HighDuplicateWebhookRate
        expr: |
          rate(webhook_duplicates_total[10m]) > 0.5
        for: 10m
        labels:
          severity: info
        annotations:
          summary: "High duplicate webhook rate"
          description: "Duplicate rate is {{ $value }} duplicates/sec"

Manual Retry

Dashboard Manual Retry

Use the Omise dashboard to manually retry failed webhooks:

  1. Navigate to Webhooks section
  2. Select your webhook endpoint
  3. View "Recent Deliveries"
  4. Find failed delivery
  5. Click "Resend"

API Manual Retry

Retrieve and process failed webhooks via API:

// Node.js - Manual retry via API
const Omise = require('omise');
const omise = Omise({ secretKey: process.env.OMISE_SECRET_KEY });

async function retryFailedWebhooks() {
  try {
    // Get recent events
    const events = await omise.events.list({ limit: 100 });

    for (const event of events.data) {
      // Check if event needs reprocessing
      const needsRetry = await checkIfNeedsRetry(event.id);

      if (needsRetry) {
        console.log(`Retrying event ${event.id}`);

        try {
          await processWebhook(event);
          console.log(`Successfully processed ${event.id}`);
        } catch (error) {
          console.error(`Failed to process ${event.id}:`, error);
        }
      }
    }
  } catch (error) {
    console.error('Error retrieving events:', error);
  }
}

async function checkIfNeedsRetry(eventId) {
  // Check if event was successfully processed
  const processed = await db.webhookEvents.findOne({
    event_id: eventId,
    status: 'completed'
  });

  return !processed;
}

// Run manually or on schedule
retryFailedWebhooks();

# Python - Manual retry script
import os
import asyncio

import omise

omise.api_secret = os.environ['OMISE_SECRET_KEY']

async def retry_failed_webhooks():
    """Retrieve and reprocess failed webhooks"""
    # Get recent events
    events = omise.Event.retrieve()

    for event in events['data']:
        # Check if needs retry
        if await needs_retry(event['id']):
            print(f"Retrying event {event['id']}")

            try:
                await process_webhook(event)
                print(f"Successfully processed {event['id']}")
            except Exception as e:
                print(f"Failed to process {event['id']}: {e}")

async def needs_retry(event_id):
    """Check if event needs reprocessing"""
    processed = await db.webhook_events.find_one({
        'event_id': event_id,
        'status': 'completed'
    })

    return processed is None

# Run as scheduled job
if __name__ == '__main__':
    asyncio.run(retry_failed_webhooks())

Bulk Retry Tool

Create a tool for bulk webhook reprocessing:

# Ruby - Bulk retry rake task
namespace :webhooks do
  desc "Retry failed webhooks"
  task retry_failed: :environment do
    failed_webhooks = FailedWebhook.where(resolved: false)
                                   .where('failed_at > ?', 7.days.ago)
                                   .order(failed_at: :desc)

    puts "Found #{failed_webhooks.count} failed webhooks to retry"

    success_count = 0
    failure_count = 0

    failed_webhooks.find_each do |failed_webhook|
      print "Retrying webhook #{failed_webhook.event_id}... "

      begin
        WebhookProcessor.process(failed_webhook.event_data)

        failed_webhook.update!(
          resolved: true,
          resolved_at: Time.current
        )

        puts "✓ SUCCESS"
        success_count += 1

      rescue => e
        puts "✗ FAILED: #{e.message}"
        failure_count += 1

        # Update attempt count
        failed_webhook.increment!(:retry_attempts)
      end

      # Rate limit retries
      sleep 0.5
    end

    puts "\nRetry Summary:"
    puts "  Successful: #{success_count}"
    puts "  Failed: #{failure_count}"
  end

  desc "Retry specific webhook by event ID"
  task :retry_by_id, [:event_id] => :environment do |t, args|
    failed_webhook = FailedWebhook.find_by(event_id: args[:event_id])

    unless failed_webhook
      puts "Webhook #{args[:event_id]} not found in failed webhooks"
      exit 1
    end

    begin
      WebhookProcessor.process(failed_webhook.event_data)
      failed_webhook.update!(resolved: true, resolved_at: Time.current)
      puts "Successfully reprocessed webhook #{args[:event_id]}"
    rescue => e
      puts "Failed to reprocess: #{e.message}"
      exit 1
    end
  end
end

Real-World Scenarios

E-commerce High Volume

Handle high webhook volume with queue-based processing:

// Node.js - Queue-based processing with Bull
const Queue = require('bull');
const webhookQueue = new Queue('webhooks', {
  redis: { host: 'localhost', port: 6379 }
});

// Configure queue
webhookQueue.process(10, async (job) => {
  const event = job.data;

  try {
    // Process with idempotency
    await processWebhookIdempotent(event);
    return { status: 'success', event_id: event.id };
  } catch (error) {
    console.error(`Processing error for ${event.id}:`, error);
    throw error; // Bull will handle retries
  }
});

// Queue failed job handler
webhookQueue.on('failed', async (job, err) => {
  console.error(`Webhook job failed: ${job.id}`, err);

  // After max retries, move to dead letter queue
  if (job.attemptsMade >= job.opts.attempts) {
    await queueForManualReview(job.data, err);
  }
});

// Endpoint adds to queue
app.post('/webhooks/omise', async (req, res) => {
  if (!verifySignature(req.body, req.headers['x-omise-signature'])) {
    return res.status(401).json({ error: 'Invalid signature' });
  }

  const event = JSON.parse(req.body.toString());

  // Add to queue with retry config
  await webhookQueue.add(event, {
    attempts: 5,
    backoff: {
      type: 'exponential',
      delay: 2000
    },
    removeOnComplete: true,
    removeOnFail: false
  });

  res.status(200).json({ received: true });
});

Subscription Service Reliability

Ensure reliable subscription processing:

# Python - Reliable subscription webhook processing
from celery import Task
from sqlalchemy.exc import IntegrityError

class IdempotentTask(Task):
    """Base task with built-in retry and backoff behavior"""
    autoretry_for = (Exception,)
    retry_kwargs = {'max_retries': 5}
    retry_backoff = True
    retry_backoff_max = 600
    retry_jitter = True

@celery_app.task(base=IdempotentTask, bind=True)
def process_subscription_webhook(self, event):
    """Process subscription-related webhook with retries"""
    event_id = event['id']

    # Idempotency check
    if is_event_processed(event_id):
        logger.info(f'Event {event_id} already processed')
        return {'status': 'duplicate', 'event_id': event_id}

    try:
        # Process based on event type
        if event['key'] == 'charge.complete':
            result = extend_subscription(event['data'])
        elif event['key'] == 'charge.failed':
            result = handle_payment_failure(event['data'])
        elif event['key'] == 'schedule.expiration.close':
            result = expire_subscription(event['data'])
        else:
            result = {'status': 'ignored', 'event_id': event_id}

        # Mark as processed
        mark_event_processed(event_id)

        return result

    except IntegrityError:
        # Duplicate processing attempt
        logger.warning(f'Duplicate processing attempt for {event_id}')
        return {'status': 'duplicate', 'event_id': event_id}

    except Exception as e:
        logger.error(f'Error processing {event_id}: {e}')

        # Check retry count
        if self.request.retries >= self.max_retries:
            # Max retries exceeded, move to dead letter queue
            queue_for_manual_review(event, str(e))

        raise  # Trigger Celery retry

def extend_subscription(charge):
    """Extend subscription after successful payment"""
    with db.transaction():
        subscription = Subscription.query.filter_by(
            charge_id=charge['id']
        ).with_for_update().first()

        if subscription is None:
            return {'status': 'no_subscription'}

        if subscription.next_billing_date > datetime.now():
            # Already extended
            return {'status': 'already_extended'}

        subscription.extend(days=30)
        subscription.last_payment = datetime.now()
        db.session.commit()

    # Send confirmation (idempotent - checks if already sent)
    send_renewal_confirmation(subscription)

    return {'status': 'success', 'subscription_id': subscription.id}

Best Practices

Webhook Processing Checklist

  • Respond within 10 seconds - Return 200 immediately
  • Implement idempotency - Use event ID tracking or database constraints
  • Process asynchronously - Use queues or background jobs
  • Log comprehensively - Track all webhook events and outcomes
  • Monitor metrics - Track error rates, latency, duplicates
  • Handle duplicates gracefully - Don't treat as errors
  • Implement retry logic - For transient failures in processing
  • Use dead letter queue - For failed webhooks after max retries
  • Set up alerting - For critical failures or high error rates
  • Test failure scenarios - Ensure graceful degradation
  • Document processes - For manual intervention procedures

Performance Optimization

// Optimize webhook processing performance
const webhookHandlers = {
  'charge.complete': handleChargeComplete,
  'charge.failed': handleChargeFailed,
  'refund.create': handleRefundCreate,
  // ... more handlers
};

async function processWebhook(event) {
  const handler = webhookHandlers[event.key];

  if (!handler) {
    console.log(`No handler for event type: ${event.key}`);
    return;
  }

  // Execute handler with timeout
  await Promise.race([
    handler(event.data),
    new Promise((_, reject) =>
      setTimeout(() => reject(new Error('Handler timeout')), 30000)
    )
  ]);
}

// Batch database operations
async function handleChargeComplete(charge) {
  const operations = [
    updateOrderStatus(charge),
    reduceInventory(charge),
    recordTransaction(charge)
  ];

  // Execute in parallel
  await Promise.all(operations);

  // Send notifications after core operations
  await sendNotifications(charge);
}

Troubleshooting

Webhook Not Retrying

Problem: Webhooks not being retried by Omise

Solutions:

  • Check response status code (must be non-2xx to trigger retry)
  • Verify endpoint responds within 10 seconds
  • Check Omise dashboard for delivery logs
  • Ensure endpoint is accessible from internet

Duplicate Processing

Problem: Webhooks processed multiple times

Solutions:

  • Implement event ID tracking
  • Use database unique constraints
  • Add idempotency checks to business logic
  • Log duplicate detection for monitoring

Processing Delays

Problem: Webhook processing takes too long

Solutions:

  • Move processing to background queue
  • Optimize database queries
  • Batch operations where possible
  • Scale horizontally with more workers
  • Monitor and profile slow operations

Lost Webhooks

Problem: Missing webhook deliveries

Solutions:

  • Check webhook configuration in dashboard
  • Verify endpoint uptime and accessibility
  • Review server logs for unhandled errors
  • Implement webhook polling as backup
  • Set up monitoring and alerting

// Backup polling mechanism
async function pollMissingWebhooks() {
  // Get last processed event timestamp
  const lastProcessed = await getLastProcessedTimestamp();

  // Query events since last processed
  const events = await omise.events.list({
    from: lastProcessed,
    limit: 100
  });

  for (const event of events.data) {
    // Check if event was processed
    const processed = await isEventProcessed(event.id);

    if (!processed) {
      console.log(`Found missing event: ${event.id}`);
      await processWebhook(event);
    }
  }
}

// Run periodically
setInterval(pollMissingWebhooks, 15 * 60 * 1000); // Every 15 minutes

FAQ

What happens if my server is down when a webhook is sent?

Omise will retry the webhook according to the retry schedule for up to 7 days. Ensure you have monitoring to detect extended downtime.

Can I control the retry schedule?

No, the retry schedule is managed by Omise. However, you can implement your own retry logic for internal processing failures.

How do I handle webhooks that arrive out of order?

Use the created_at timestamp to determine event sequence. Store timestamps and process events based on chronological order, not arrival order.
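A minimal sketch of that guard, assuming you persist the `created_at` of the last event applied per resource — the in-memory map and helper names here are placeholders for your own storage:

```javascript
// Skip events older than the newest one already applied to a resource.
// getLastAppliedAt / setLastAppliedAt stand in for your own persistence.
const lastApplied = new Map();

function getLastAppliedAt(resourceId) {
  return lastApplied.has(resourceId) ? lastApplied.get(resourceId) : null;
}

function setLastAppliedAt(resourceId, timestamp) {
  lastApplied.set(resourceId, timestamp);
}

function shouldApply(event, resourceId) {
  const last = getLastAppliedAt(resourceId);
  const eventTime = new Date(event.created_at).getTime();
  if (last !== null && eventTime <= last) {
    return false; // stale or duplicate - a newer event was already applied
  }
  setLastAppliedAt(resourceId, eventTime);
  return true;
}
```

Note this only guards against applying stale state; events that each carry an independent action still need the idempotency patterns described earlier.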

Should I return an error if webhook processing fails internally?

No, return 200 OK immediately and handle internal failures separately. Returning errors causes Omise to retry, which may not help if the issue is in your code.

How long should I store processed event IDs?

Store for at least 7 days (Omise retry period). Recommended retention is 30 days for better duplicate detection.

What if I need to reprocess old webhooks?

Retrieve events via the Omise API and reprocess manually. Use the Events API to fetch historical events by date range.

How do I test retry behavior?

Return non-200 status codes in test mode to trigger retries. Monitor the dashboard to see retry attempts.
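One simple way to force a failure is a temporary switch in your test-mode endpoint — the environment variable name here is illustrative, not an Omise convention:

```javascript
// Force retries in test mode by returning a 5xx while a flag is set.
// FORCE_WEBHOOK_FAILURE is an illustrative variable name.
function webhookResponseStatus() {
  if (process.env.FORCE_WEBHOOK_FAILURE === '1') {
    return 503; // non-2xx, so Omise schedules a retry
  }
  return 200;
}
```

Set the flag, send a test event, watch the dashboard record the failed attempts, then unset it and confirm the next retry succeeds.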

Can I manually trigger webhook retries from code?

Yes, retrieve the event via API and process it manually. See the Manual Retry section for examples.

How do I handle webhooks during deployment?

Use zero-downtime deployments with health checks. Alternatively, disable webhooks during maintenance and manually reprocess missed events afterward.

What's the maximum number of retry attempts?

Omise retries for up to 7 days with exponential backoff. The exact number of attempts varies based on the retry schedule.

Next Steps

After implementing reliable webhook processing:

  1. Test failure scenarios - Simulate errors and verify retry behavior
  2. Monitor webhook health - Set up dashboards and alerts
  3. Implement dead letter queue - Handle permanently failed webhooks
  4. Document procedures - Create runbooks for manual intervention
  5. Optimize performance - Profile and improve processing speed