Easiest way to schedule a Python Script in AWS Glue as a Job – 2023

Hey guys, in this blog we will see how to schedule a Python script in AWS Glue as a Job that runs every hour. I have tried to make this tutorial as easy as possible, with every step explained.

So without any further ado, let’s do it…

Step 1 – Search and Open AWS Glue in your AWS account

Step 2 – Open Jobs from Legacy Pages.

  • In the left sidebar, we can see Legacy Pages.
  • Click on that and open Jobs from there.

Step 3 – Add Job

Click on Add Job.

Step 4 – Configure your Job

  • Name your Job.
  • Change the S3 bucket where Glue will store your Python script and temporary files.
  • And finally, choose an IAM Role.
  • Leave everything else as it is.
  • Scroll down and click on Next.
  • A screen like the one below will pop up and ask you to make connections. My Python Script doesn’t require any connection, so I will not select any of the connections.
  • You can click on Save job and edit script.
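
If you prefer code over clicking, the same settings can be written down for boto3’s Glue `create_job` call. This is only a minimal sketch: the bucket name, the role ARN, the `scripts/` and `temporary/` prefixes, the `glueetl` job type and Glue version 4.0 are all my assumptions for illustration, not values taken from the console screens above.

```python
def build_job_definition(name, role_arn, bucket):
    """Return keyword arguments for the boto3 Glue client's create_job call.

    The S3 key layout (scripts/<name>.py, temporary/) is a hypothetical choice.
    """
    return {
        "Name": name,
        "Role": role_arn,
        "GlueVersion": "4.0",  # assumed; pick the version your account uses
        "Command": {
            "Name": "glueetl",  # assumed Spark job; "pythonshell" is another job type
            "ScriptLocation": f"s3://{bucket}/scripts/{name}.py",
            "PythonVersion": "3",
        },
        "DefaultArguments": {
            # Where Glue keeps temporary files for this job
            "--TempDir": f"s3://{bucket}/temporary/",
        },
    }


def create_job(definition):
    """Send the definition to AWS. Needs boto3 and valid AWS credentials."""
    import boto3  # imported here so the builder above works without boto3

    return boto3.client("glue").create_job(**definition)
```

Building the dictionary separately from the AWS call makes it easy to print and check the settings before anything is created in your account.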

Step 5 – Let’s add our Python code

Now, in the left menu bar, click on Jobs (new), which will open a console where we can add our code and schedule it later.

A screen like the one below will open where you need to select your Glue Job.

  • Once you click on your Job, a code editor will open where you need to paste the Python code that you want to schedule.
  • The demo code that I wrote to check my Glue Job imports pandas and numpy and prints a few messages.
  • Depending on the job type and Glue version, libraries like these may not be present in the Glue environment (or not in the version you need).
  • So now we will add these libraries to our environment.
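
Before adding anything, you can check which libraries are really missing. The snippet below is a small sketch that uses only the standard library; the `missing_modules` helper and the list of names are my own illustration, so change them to whatever your script imports.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


required = ["pandas", "numpy"]  # the libraries the demo script imports
missing = missing_modules(required)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All required modules are available.")
```

Pasting this at the top of the job script and running it once prints the result to the job’s output, so you only add the libraries that are actually needed.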

Steps to include external libraries:

  • Open Job details.
  • Scroll down and click on Advanced Properties.
  • Scroll down and under Job Parameters click on ‘Add new parameter’.
  • Under Key add ‘--additional-python-modules’ and under Value add the libraries as a comma-separated list.
  • Click on Save.
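
To avoid typos in the Value field, the comma-separated string can be generated. This is a minimal sketch: the helper name is hypothetical, and the pinned version in the usage note is just an example, not a recommendation.

```python
def additional_modules_parameter(modules):
    """Build the Key/Value pair for the Job Parameters form.

    `modules` maps a package name to a version string, or to None for no pin.
    """
    parts = []
    for name, version in modules.items():
        # pip-style pin when a version is given, bare name otherwise
        parts.append(f"{name}=={version}" if version else name)
    return "--additional-python-modules", ",".join(parts)


key, value = additional_modules_parameter({"pandas": None, "numpy": None})
print("Key:  ", key)
print("Value:", value)
```

If you want a specific version, pass it as the dictionary value, for example `{"pandas": "1.5.3"}`, and the entry becomes `pandas==1.5.3`. Note that there are no spaces after the commas.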

Step 6 – Let’s schedule a Python Script in AWS Glue as a Job

  • Click on Schedules.
  • Click on Create Schedule.
  • Add a Name and set how often the Job should run (every hour in our case).
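
For reference, an hourly schedule can also be expressed as a Glue cron expression and attached to the Job with boto3’s `create_trigger`. This is a sketch under assumptions: the trigger and job names are placeholders, and I chose minute 0 of every hour; Glue evaluates cron schedules in UTC.

```python
# Six fields: minutes, hours, day-of-month, month, day-of-week, year
HOURLY = "cron(0 * * * ? *)"  # minute 0 of every hour (UTC)


def build_hourly_trigger(trigger_name, job_name):
    """Return keyword arguments for the boto3 Glue client's create_trigger call."""
    return {
        "Name": trigger_name,
        "Type": "SCHEDULED",
        "Schedule": HOURLY,
        "Actions": [{"JobName": job_name}],
        "StartOnCreation": True,  # activate the trigger right away
    }


def create_trigger(definition):
    """Send the trigger to AWS. Needs boto3 and valid AWS credentials."""
    import boto3  # imported here so the builder above works without boto3

    return boto3.client("glue").create_trigger(**definition)
```

For example, `build_hourly_trigger("hourly-demo-trigger", "hourly-demo")` gives a dictionary you can inspect before passing it to `create_trigger`.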

Step 7 – Let’s run it

  • Click on Run and it will run your Job.
  • And it should run successfully.
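
The Run button has an equivalent in boto3 as well: `start_job_run` starts the Job and `get_job_run` reports its state. Below is a rough sketch of starting a run and waiting for it to finish. The set of states I treat as finished and the 30-second polling interval are my assumptions; the Glue client is passed in so the function does not create one itself.

```python
import time

# States treated as "the run is over" in this sketch (an assumption)
FINISHED_STATES = {"SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT", "ERROR"}


def run_and_wait(glue, job_name, poll_seconds=30, sleep=time.sleep):
    """Start a job run and poll until it reaches a finished state.

    `glue` is a boto3 Glue client, e.g. boto3.client("glue").
    Returns the final JobRunState string.
    """
    run_id = glue.start_job_run(JobName=job_name)["JobRunId"]
    while True:
        run = glue.get_job_run(JobName=job_name, RunId=run_id)["JobRun"]
        state = run["JobRunState"]
        if state in FINISHED_STATES:
            return state
        sleep(poll_seconds)  # wait before asking again
```

With credentials configured, something like `run_and_wait(boto3.client("glue"), "hourly-demo")` should return `SUCCEEDED` when the Job runs fine (the job name here is a placeholder).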

You can also see All Logs, Output Logs, and Error Logs on this page.

Output Logs

You can see the messages here that we printed from our code.
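
As a small illustration of where printed messages come from, here is a sketch with two hypothetical helpers. It assumes, as is usually the case, that whatever the script writes to standard output lands in Output Logs and whatever it writes to standard error lands in Error Logs.

```python
import sys


def log_info(message):
    """Write a message to standard output."""
    print(f"[INFO] {message}")


def log_error(message):
    """Write a message to standard error."""
    print(f"[ERROR] {message}", file=sys.stderr)


log_info("Job started")
log_info("Job finished")
```

Prefixing the messages like this makes them easier to spot among the other lines that Glue writes to the logs.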

And this is how you can schedule a Python Script in AWS Glue as a Job.

So this is all for this blog, folks. Thanks for reading it, and I hope you are taking something with you after reading this. Till the next time…

