Hey guys, in this blog we will see how to schedule a Python script in AWS Glue as a Job that runs every hour. I have tried to keep this tutorial as easy as possible, with every step explained.
So without further ado, let’s do it…
Step 1 – Search and Open AWS Glue in your AWS account
![schedule a Python Script in AWS Glue as a Job](https://machinelearningprojects.net/wp-content/uploads/2022/09/schedule-a-Python-Script-in-AWS-Glue-as-a-Job.webp)
Step 2 – Open Visual ETL
- In the left sidebar, we can see Visual ETL.
- Click on that and you will see a page with all the jobs listed.
![schedule a Python Script in AWS Glue as a Job](https://machinelearningprojects.net/wp-content/uploads/2023/09/schedule-a-Python-Script-in-AWS-Glue-as-a-Job-visual-etl-page.webp)
Step 3 – Add Job
- Select the type of job you want to create.
- I will select Python Shell Script Editor, with ‘Create a new script with boilerplate code’ selected.
- Click on the Create button.
![schedule a Python Script in AWS Glue as a Job](https://machinelearningprojects.net/wp-content/uploads/2023/09/schedule-a-Python-Script-in-AWS-Glue-as-a-Job-create-job.webp)
Step 4 – Configure your Job
- It will open a page like below.
- Now you need to do some configurations.
![schedule a Python Script in AWS Glue as a Job](https://machinelearningprojects.net/wp-content/uploads/2023/09/image-3-1024x422.png)
- Name your Job.
- And choose an IAM Role.
- Keep everything else as it is and click on the Save button.
- The following is the demo code that I wrote to check my Glue Job.
![schedule a Python Script in AWS Glue as a Job](https://machinelearningprojects.net/wp-content/uploads/2022/09/schedule-a-Python-Script-in-AWS-Glue-as-a-Job6.webp)
- As you can see, the code imports pandas and numpy. Depending on the Python version and library set of your job, these may not be present in the Glue environment (or not in the version you need).
- So now we will add these libraries to our environment.
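Before adding libraries, it can help to check what the job environment already provides. The following is a minimal sketch, not the demo code from the screenshot: `report_modules` is a hypothetical helper name, and it uses only the Python standard library (`importlib.metadata` needs Python 3.8 or newer).

```python
import importlib.util
from importlib import metadata


def report_modules(names):
    """Return {module_name: version, "unknown", or None} for each top-level module."""
    report = {}
    for name in names:
        # find_spec returns None when a top-level module cannot be imported.
        if importlib.util.find_spec(name) is None:
            report[name] = None
            continue
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Importable, but not installed as a distribution under this name.
            report[name] = "unknown"
    return report


if __name__ == "__main__":
    for module, version in report_modules(["pandas", "numpy"]).items():
        status = "missing" if version is None else f"installed ({version})"
        print(f"{module}: {status}")
```

If you run this as the job script, the printed lines appear in the job's output logs, so you can see which libraries still need to be added.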
Step 5 – Include external libraries
- Open Job details.
- Scroll down and click on Advanced Properties.
- Scroll down and under Job Parameters click on ‘Add new parameter’.
- Under Key add `--additional-python-modules`, and under Value add the libraries as a comma-separated list.
- Click on Save.
![schedule a Python Script in AWS Glue as a Job](https://machinelearningprojects.net/wp-content/uploads/2022/09/schedule-a-Python-Script-in-AWS-Glue-as-a-Job7.webp)
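The same job parameter can also be set from code. Below is a rough sketch, assuming you create the job with the boto3 Glue client's `create_job` call, where console job parameters correspond to `DefaultArguments`. The job name, role name, S3 script path, and both helper function names are placeholders I made up for illustration.

```python
def additional_modules_argument(modules):
    """Build the job parameter that asks Glue to install extra Python modules."""
    cleaned = [m.strip() for m in modules if m.strip()]
    return {"--additional-python-modules": ",".join(cleaned)}


def build_job_request(name, role, script_location, modules):
    """Assemble keyword arguments for a Python shell job (for glue.create_job)."""
    return {
        "Name": name,
        "Role": role,
        "Command": {
            "Name": "pythonshell",          # Python shell job type
            "ScriptLocation": script_location,
            "PythonVersion": "3.9",
        },
        "DefaultArguments": additional_modules_argument(modules),
        "MaxCapacity": 0.0625,              # smallest capacity for Python shell jobs
    }


if __name__ == "__main__":
    request = build_job_request(
        name="my-demo-job",                            # hypothetical job name
        role="MyGlueRole",                             # hypothetical IAM role
        script_location="s3://my-bucket/scripts/demo.py",  # hypothetical path
        modules=["pandas", "numpy"],
    )
    print(request)
    # To actually create the job (needs AWS credentials and boto3 installed):
    # import boto3
    # boto3.client("glue").create_job(**request)
```

You can pin versions in the same value, for example `pandas==1.5.3,numpy`.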
Step 6 – Let’s schedule a Python Script in AWS Glue as a Job
- Click on Schedules.
- Click on Create Schedule.
- Add a Name and set the frequency to Hourly, so that the job runs every hour.
![schedule a Python Script in AWS Glue as a Job](https://machinelearningprojects.net/wp-content/uploads/2022/09/schedule-a-Python-Script-in-AWS-Glue-as-a-Job8.webp)
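If you prefer to create the hourly schedule from code, here is a minimal sketch. It assumes a scheduled trigger created with the boto3 Glue client's `create_trigger` call and a Glue cron expression (six fields: minutes, hours, day of month, month, day of week, year). The trigger name, job name, and helper function names are hypothetical.

```python
def hourly_cron(minute=0):
    """Return a Glue cron expression that fires once every hour at the given minute."""
    if not 0 <= minute <= 59:
        raise ValueError("minute must be between 0 and 59")
    return f"cron({minute} * * * ? *)"


def build_hourly_trigger(trigger_name, job_name, minute=0):
    """Assemble keyword arguments for a scheduled trigger (for glue.create_trigger)."""
    return {
        "Name": trigger_name,
        "Type": "SCHEDULED",
        "Schedule": hourly_cron(minute),
        "Actions": [{"JobName": job_name}],
        "StartOnCreation": True,  # activate the trigger as soon as it is created
    }


if __name__ == "__main__":
    trigger = build_hourly_trigger("my-hourly-trigger", "my-demo-job")
    print(trigger)
    # To actually create the trigger (needs AWS credentials and boto3 installed):
    # import boto3
    # boto3.client("glue").create_trigger(**trigger)
```

Glue schedules are evaluated in UTC, so minute 0 here means the top of every hour in UTC.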
Step 7 – Let’s run it
- Click on Run and it will run your Job.
- And it should run successfully.
![schedule a Python Script in AWS Glue as a Job](https://machinelearningprojects.net/wp-content/uploads/2022/09/schedule-a-Python-Script-in-AWS-Glue-as-a-Job10.webp)
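Clicking Run in the console is the simplest option, but a run can also be started and watched from code. This is a hedged sketch that assumes the boto3 Glue client's `start_job_run` and `get_job_run` calls; `run_and_wait` is a made-up helper name, and the client is passed in so the function itself does not need AWS credentials to be imported.

```python
import time

# Job run states after which the run will not change any more.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT", "ERROR"}


def run_and_wait(glue_client, job_name, poll_seconds=15, max_polls=120, sleep=time.sleep):
    """Start a Glue job run and poll until it reaches a terminal state."""
    run_id = glue_client.start_job_run(JobName=job_name)["JobRunId"]
    for _ in range(max_polls):
        response = glue_client.get_job_run(JobName=job_name, RunId=run_id)
        state = response["JobRun"]["JobRunState"]
        if state in TERMINAL_STATES:
            return state
        sleep(poll_seconds)
    raise TimeoutError(f"Job run {run_id} did not finish after {max_polls} polls")


# Example usage (needs AWS credentials and boto3 installed):
# import boto3
# print(run_and_wait(boto3.client("glue"), "my-demo-job"))  # hypothetical job name
```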
You can also see All Logs, Output Logs, and Error Logs on this page.
Output Logs
![schedule a Python Script in AWS Glue as a Job](https://machinelearningprojects.net/wp-content/uploads/2022/09/schedule-a-Python-Script-in-AWS-Glue-as-a-Job11-1.webp)
You can see the messages here that we printed from our code.
And this is how you can schedule a Python Script in AWS Glue as a Job.
So this is all for this blog, folks. Thanks for reading it, and I hope you are taking something with you after reading this. Till next time…