IB Version: Applicable for all versions
Issue:
Flow failed to execute.
Debugging Steps:
1. Open the Flow Results/Logs and check the error shown. A failure can have several causes, such as OCR issues, refiner issues, or UDF failures.
2. If it is a refiner-related error, open the refiner and run the affected field to check for slowness. See the related article for more detailed steps.
3. If it is a model-related failure, see the related article for debugging steps.
4. If it is related to OCR, check whether similar documents are processed successfully.
   - A timeout error in the flow results means either that OCR could not process the request in time or that the request was not picked up for processing because of load.
   - Points to check:
     - Are many flows running in parallel?
     - Are similar files processed successfully?
     - Is the corresponding OCR service (Msft, ABBYY, etc.) running fine? Check for restarts caused by memory issues (exit code: 137, reason: OOMKilled). If there are any, check the memory stats using Grafana or a similar tool and adjust the memory as needed.
5. Check for restarts in any of the services (celery-app-tasks, ocr-service, file-tservice, ocr-msft, api-server-apps, api-server, server-nginx). If any have restarted, check the describe output of the restarted pod. If the restart is due to OOMKilled, try increasing the memory and see if it helps. If that does not help, or the failure has another cause, attach the describe output to the ticket.
6. If the failures are related to a UDF or a pre-flow/post-flow script, add more logging to the scripts to pinpoint the failing step.
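
For step 6, one way to narrow down a failing UDF or pre-flow/post-flow script is to log before and after each stage, so the last "starting" line without a matching "finished" line points at the failing step. The sketch below is a minimal illustration using Python's standard `logging` module. The names `run_step`, `clean_text`, `my_udf` and the logger name are hypothetical and are not part of any product API; adapt them to the actual script.

```python
import logging

logger = logging.getLogger("udf_debug")  # hypothetical logger name


def run_step(name, func, *args, **kwargs):
    """Run one stage of a script, logging its start, success, or failure."""
    logger.info("step %s: starting", name)
    try:
        result = func(*args, **kwargs)
    except Exception:
        # logger.exception records the full traceback at ERROR level.
        logger.exception("step %s: failed", name)
        raise
    logger.info("step %s: finished", name)
    return result


def clean_text(value):
    # Hypothetical stage: collapse repeated whitespace in an extracted value.
    return " ".join(value.split())


def my_udf(value):
    # Hypothetical UDF made of two stages, each wrapped with logging.
    cleaned = run_step("clean_text", clean_text, value)
    return run_step("to_upper", str.upper, cleaned)
```

Because `run_step` re-raises the exception after logging it, the flow still fails as before; only the amount of detail in the logs changes.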
If the above steps do not help, please raise a ticket with the details below:
- Steps in the flow.
- Flow logs and trace from dashboard.
- Logs from the following services:
  - celery-app-tasks
  - api-server-apps/api-server
  - job-service
  - model-service (if the flow uses models)
  - file-tservice (grpc-file-service)
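
The restart check in step 5 and the collection of describe output and logs for the ticket can be partly scripted. The sketch below assumes a Kubernetes deployment where the JSON output of `kubectl get pods -o json` is available. The helper names, the `SERVICES` tuple, and the assumption that pod names contain the service name are illustrative and not confirmed by this article.

```python
import json

# Service names taken from step 5 of this article. Matching them as
# substrings of pod names is an assumption about how pods are named.
SERVICES = (
    "celery-app-tasks", "ocr-service", "file-tservice", "ocr-msft",
    "api-server-apps", "api-server", "server-nginx",
)


def find_restarts(raw_json):
    """Return (pod, container, restarts, reason, exit_code) per restarted container.

    raw_json is the text printed by `kubectl get pods -o json`.
    """
    found = []
    for pod in json.loads(raw_json).get("items", []):
        pod_name = pod["metadata"]["name"]
        if not any(service in pod_name for service in SERVICES):
            continue
        for status in pod.get("status", {}).get("containerStatuses", []):
            if status.get("restartCount", 0) == 0:
                continue
            # lastState.terminated holds the reason for the previous exit,
            # for example reason "OOMKilled" with exit code 137.
            terminated = status.get("lastState", {}).get("terminated", {})
            found.append((
                pod_name,
                status["name"],
                status["restartCount"],
                terminated.get("reason"),
                terminated.get("exitCode"),
            ))
    return found


def ticket_commands(restarts, namespace):
    """Build kubectl commands whose output can be attached to the ticket."""
    commands = []
    for pod_name, container, _count, _reason, _code in restarts:
        commands.append(f"kubectl describe pod {pod_name} -n {namespace}")
        # --previous prints logs from the container instance that was restarted.
        commands.append(
            f"kubectl logs {pod_name} -c {container} -n {namespace} --previous"
        )
    return commands
```

The script only builds the commands and does not run them, so the list can be reviewed first. Services that appear only in the ticket list above, such as job-service, would need to be added to `SERVICES` to be included.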