Huidong Tian's Blog

Don't know? Google it!

Automatic Notice When Vacancy Available

Today I stumbled upon a webpage listing several job positions that I am qualified for; unfortunately, all of them had expired. How many chances do we lose this way?! So I decided to do something to limit this kind of loss, and of course using our smart R!

The idea is simple: check the job vacancy webpages regularly, and if a new position appears, open the webpage and/or send a notice to my email.

Let’s take the vacancy page of the Department of Biosciences, UiO as an example. The webpage lists only the positions that have not expired. For this kind of webpage, we can use the following code:

Download Webpage
workspace <- "C:/Users"
file_outdate <- paste(workspace, "outdate.html", sep = "/")
file_updated <- paste(workspace, "updated.html", sep = "/")
URL <- "http://www.mn.uio.no/ibv/english/about/vacancies/"

if (file.exists(file_outdate)) {
  download.file(URL, file_updated)
} else {
  download.file(URL, file_outdate)
  download.file(URL, file_updated)
}

html_outdate <- readLines(file_outdate)
html_updated <- readLines(file_updated)
Extract Position Titles
Items <- function(str = html_outdate) {
  # Regular expression;
  ptn <- "item-title.+?>(.+?)</a>"
  HTML_Date <- grep(ptn, str, value = TRUE)
  # First time to use sapply by setting FUN as "[", cool!
  sapply(regmatches(HTML_Date, regexec(ptn, HTML_Date)), "[", 2)
}

# New position available or not;
boo <- any(!Items(str = html_updated) %in% Items(str = html_outdate))

# Remove the html file out of date;
file.remove(file_outdate)
file.rename(file_updated, file_outdate)
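
To see what the pattern captures, here is a minimal sketch that applies the same regular expression to a couple of made-up HTML lines. The sample markup, the helper name extract_titles and the use of setdiff() are assumptions for illustration only; the real page may be structured differently.

```r
# Hypothetical HTML lines; the real vacancy page markup may differ.
old_html <- c(
  '<a class="item-title" href="/jobs/1">PhD Research Fellow in Ecology</a>',
  '<p>Some unrelated text</p>'
)
new_html <- c(
  old_html,
  '<a class="item-title" href="/jobs/2">Postdoctoral Fellow in Genetics</a>'
)

# Same pattern as in Items(): the group in parentheses is the link text.
ptn <- "item-title.+?>(.+?)</a>"

# Hypothetical helper mirroring Items(), but without a default argument.
extract_titles <- function(lines) {
  hits <- grep(ptn, lines, value = TRUE)        # keep only matching lines
  m <- regmatches(hits, regexec(ptn, hits))     # full match + capture group
  vapply(m, function(x) x[2], character(1))     # element 2 is the capture
}

# Titles present in the new download but not in the old one.
new_titles <- setdiff(extract_titles(new_html), extract_titles(old_html))
print(new_titles)
```

Using setdiff() instead of any(!... %in% ...) also tells you which titles are new, which could go into the email message.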
Display and Send Email
if (boo) {
  browseURL(URL)
  library(mail) # Need to install this package first;
  sendmail("you@gmail.com", subject = "Vacancy", message = URL)
}

The difficult part is assembling the regular expression, and I have written a tutorial on that topic. The last step is to run the above code in batch mode.
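
Batch mode usually means saving the code as a script and letting the operating system's scheduler run it. If no scheduler is at hand, a plain polling loop inside a long-running R session is one possible alternative. The sketch below is only an illustration under that assumption: poll, check, interval and n_checks are hypothetical names, not part of the code above.

```r
# Hypothetical helper: call `check` every `interval` seconds, `n_checks` times.
# `check` is any function returning TRUE when a new position was found.
poll <- function(check, interval = 3600, n_checks = 24) {
  results <- logical(n_checks)
  for (i in seq_len(n_checks)) {
    # A failed download should not stop the loop, so errors count as FALSE.
    results[i] <- isTRUE(tryCatch(check(), error = function(e) {
      message("Check failed: ", conditionMessage(e))
      FALSE
    }))
    if (i < n_checks) Sys.sleep(interval)  # wait before the next check
  }
  invisible(results)
}

# Example with a dummy check that never finds anything;
# interval = 0 only so that the example finishes immediately.
found <- poll(function() FALSE, interval = 0, n_checks = 3)
print(found)
```

In practice, check would wrap the download, comparison and notification steps shown earlier and return boo.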
