OpenQuestoes

Job detail

INGEST_FILE

ID
4e1381d9-509b-4382-9f9e-452a62d90d41
Status
FAILED_FINAL
Entity
discovered_links / 5b93f45d-8a81-4621-a576-ddc721a353e4
Attempts
3 / 3
Created
Scheduled
Locked by
Error
ValueError: Downloaded file is not a PDF: content_type='text/html' url=https://conhecimento.fgv.br
Runtime settings
max attempts 3, lock 300s, retry backoff 60s
Reference
{'url': 'https://conhecimento.fgv.br'}
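
The settings above, together with the `retryable` flags in the log entries for this job, are consistent with a retry rule that depends only on the attempt count. The following is a minimal sketch of such a rule, not the worker's actual code: `RuntimeSettings`, `is_retryable`, `status_after_failure`, and the `RETRY_SCHEDULED` status name are hypothetical. Only `FAILED_FINAL` and the three numeric values come from this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuntimeSettings:
    # Values taken from the "Runtime settings" line; field names are assumed.
    max_attempts: int = 3
    lock_seconds: int = 300
    retry_backoff_seconds: int = 60


def is_retryable(attempt: int, settings: RuntimeSettings) -> bool:
    """A failed attempt is retried only while attempts remain."""
    return attempt < settings.max_attempts


def status_after_failure(attempt: int, settings: RuntimeSettings) -> str:
    """Return the job status after a failed attempt.

    "RETRY_SCHEDULED" is a placeholder name; "FAILED_FINAL" is the
    status shown on this page.
    """
    if is_retryable(attempt, settings):
        return "RETRY_SCHEDULED"
    return "FAILED_FINAL"


settings = RuntimeSettings()
for attempt in (1, 2, 3):
    print(attempt, is_retryable(attempt, settings), status_after_failure(attempt, settings))
```

Under this rule, attempts 1 and 2 are retryable and attempt 3 is final, which matches the flags recorded in the log. The rule does not look at the kind of error, so a deterministic failure still uses every attempt.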

Logs

Created  Level  Message  Context
ERROR Job failed
{'error': "ValueError: Downloaded file is not a PDF: content_type='text/html' url=https://conhecimento.fgv.br", 'retryable': False, 'traceback': 'Traceback (most recent call last):\n  File "/app/jobs/opq_worker/runner.py", line 62, in run_once\n    handler(job, self.context)\n  File "/app/jobs/opq_worker/handlers.py", line 103, in handle_ingest_file\n    raise ValueError(f"Downloaded file is not a PDF: content_type={content_type!r} url={url}")\nValueError: Downloaded file is not a PDF: content_type=\'text/html\' url=https://conhecimento.fgv.br\n'}
INFO Job started
{'attempt': 3, 'job_type': 'INGEST_FILE', 'entity_id': '5b93f45d-8a81-4621-a576-ddc721a353e4', 'worker_id': '293584ae7d10:281473427816864', 'entity_type': 'discovered_links', 'max_attempts': 3}
ERROR Job failed
{'error': "ValueError: Downloaded file is not a PDF: content_type='text/html' url=https://conhecimento.fgv.br", 'retryable': True, 'traceback': 'Traceback (most recent call last):\n  File "/app/jobs/opq_worker/runner.py", line 62, in run_once\n    handler(job, self.context)\n  File "/app/jobs/opq_worker/handlers.py", line 103, in handle_ingest_file\n    raise ValueError(f"Downloaded file is not a PDF: content_type={content_type!r} url={url}")\nValueError: Downloaded file is not a PDF: content_type=\'text/html\' url=https://conhecimento.fgv.br\n'}
INFO Job started
{'attempt': 2, 'job_type': 'INGEST_FILE', 'entity_id': '5b93f45d-8a81-4621-a576-ddc721a353e4', 'worker_id': '293584ae7d10:281473427816864', 'entity_type': 'discovered_links', 'max_attempts': 3}
ERROR Job failed
{'error': "ValueError: Downloaded file is not a PDF: content_type='text/html' url=https://conhecimento.fgv.br", 'retryable': True, 'traceback': 'Traceback (most recent call last):\n  File "/app/jobs/opq_worker/runner.py", line 62, in run_once\n    handler(job, self.context)\n  File "/app/jobs/opq_worker/handlers.py", line 103, in handle_ingest_file\n    raise ValueError(f"Downloaded file is not a PDF: content_type={content_type!r} url={url}")\nValueError: Downloaded file is not a PDF: content_type=\'text/html\' url=https://conhecimento.fgv.br\n'}
INFO Job started
{'attempt': 1, 'job_type': 'INGEST_FILE', 'entity_id': '5b93f45d-8a81-4621-a576-ddc721a353e4', 'worker_id': '293584ae7d10:281473427816864', 'entity_type': 'discovered_links', 'max_attempts': 3}
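
All three attempts failed with the same error because the URL returns an HTML page, so retrying could not have succeeded. One possible way to avoid the wasted attempts is to raise a distinct exception type for failures that are known to be permanent. The sketch below assumes a hypothetical `PermanentJobError` class and `ensure_pdf` helper; neither is known to exist in `opq_worker`. The only outside fact it relies on is that PDF files begin with the bytes `%PDF-`.

```python
class PermanentJobError(Exception):
    """Hypothetical marker for failures that a retry cannot fix."""


def ensure_pdf(content_type: str, body: bytes, url: str) -> None:
    """Raise PermanentJobError unless the download looks like a PDF.

    The download is accepted if the Content-Type media type is
    application/pdf or if the body starts with the PDF magic bytes.
    """
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "application/pdf" or body.startswith(b"%PDF-"):
        return
    raise PermanentJobError(
        f"Downloaded file is not a PDF: content_type={content_type!r} url={url}"
    )


def should_retry(exc: Exception, attempt: int, max_attempts: int) -> bool:
    """Assumed policy: never retry permanent errors, otherwise retry while attempts remain."""
    if isinstance(exc, PermanentJobError):
        return False
    return attempt < max_attempts


try:
    ensure_pdf("text/html", b"<html></html>", "https://conhecimento.fgv.br")
except PermanentJobError as exc:
    print(should_retry(exc, attempt=1, max_attempts=3))
```

With this policy the job in this log would be marked final on attempt 1, while transient errors such as timeouts would still be retried up to `max_attempts`.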