An Update to my first Python Script

Nothing can ever really be considered done when you're talking about programming, right?

I decided to try adding images to the Python script I wrote last week and was able to do it without too much hassle.

The first thing I did was update the code in Pythonista on my iPad Pro and verify that it would run.

It took some doing, mostly because I forgot that the attributes in an img tag already included what I needed. Initially I was trying to programmatically get the name of the person from the image file itself using regular expressions, and it didn't work out well.
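Here is a minimal sketch of the "read it from the img attributes" idea. It is not the actual script: it uses the standard library's HTMLParser instead of Beautiful Soup so it runs anywhere, the sample markup is hypothetical, and it assumes the person's name lives in the title attribute.

```python
from html.parser import HTMLParser


class ImgAttrParser(HTMLParser):
    """Collect (title, src) pairs from every <img> tag that is fed in."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            # attrs arrives as a list of (name, value) pairs
            attr_map = dict(attrs)
            self.images.append((attr_map.get("title", ""), attr_map.get("src", "")))


# Hypothetical markup, loosely modelled on a staff page.
sample_html = '<div><img src="/user_images/Team/Ozzy.png" title="Ozzy"></div>'

parser = ImgAttrParser()
parser.feed(sample_html)

# Turn each image into a Markdown image cell for the table.
for name, src in parser.images:
    print('![alt text]({} "{}")'.format(src, name))
```

No regular expressions on file names needed: the tag already carries both the image location and the name.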

Once that was done I branched master on GitHub into a development branch and copied the changes there. Then I opened a pull request in the macOS GitHub Desktop application.

Finally, I used the macOS GitHub app to merge my pull request from development into master and now have the changes.

The updated script now also gets the image data and adds it to the MultiMarkdown table:

| Name | Title | Image |
| --- | --- | --- |
|Mike Cheley|CEO/Creative Director|![alt text](https://www.graphtek.com/user_images/Team/Mike_Cheley.png "Mike Cheley")|
|Ozzy|Official Greeter|![alt text](https://www.graphtek.com/user_images/Team/Ozzy.png "Ozzy")|
|Jay Sant|Vice President|![alt text](https://www.graphtek.com/user_images/Team/Jay_Sant.png "Jay Sant")|
|Shawn Isaac|Vice President|![alt text](https://www.graphtek.com/user_images/Team/Shawn_Isaac.png "Shawn Isaac")|
|Jason Gurzi|SEM Specialist|![alt text](https://www.graphtek.com/user_images/Team/Jason_Gurzi.png "Jason Gurzi")|
|Yvonne Valles|Director of First Impressions|![alt text](https://www.graphtek.com/user_images/Team/Yvonne_Valles.png "Yvonne Valles")|
|Ed Lowell|Senior Designer|![alt text](https://www.graphtek.com/user_images/Team/Ed_Lowell.png "Ed Lowell")|
|Paul Hasas|User Interface Designer|![alt text](https://www.graphtek.com/user_images/Team/Paul_Hasas.png "Paul Hasas")|
|Alan Schmidt|Senior Web Developer|![alt text](https://www.graphtek.com/user_images/Team/Alan_Schmidt.png "Alan Schmidt")|

Which gets displayed as this:

| Name | Title | Image |
| --- | --- | --- |
| Mike Cheley | CEO/Creative Director | alt text |
| Ozzy | Official Greeter | alt text |
| Jay Sant | Vice President | alt text |
| Shawn Isaac | Vice President | alt text |
| Jason Gurzi | SEM Specialist | alt text |
| Yvonne Valles | Director of First Impressions | alt text |
| Ed Lowell | Senior Designer | alt text |
| Paul Hasas | User Interface Designer | alt text |
| Alan Schmidt | Senior Web Developer | alt text |

My First Python Script that does 'something'

I've been interested in Python as a tool for a while, and today I had the chance to see what I could do with it.

With my 12.9-inch iPad Pro set up at my desk, I started out. I have Ole Zorn's Pythonista 3 installed, so I started on my first script.

My first task was to scrape something from a website. I tried to start with a website listing doctors, but for some reason the HTML that came back didn't include anything useful.

So the next best thing was to find a website with staff listed on it. I used my dad's company and his staff listing as a starting point.

I started with a quick Google search for Pythonista web scraping and came across this post on the Pythonista forums.

That got me this much of my script:

import bs4, requests

myurl = 'http://www.graphtek.com/Our-Team'

def get_beautiful_soup(url):
    return bs4.BeautifulSoup(requests.get(url).text, "html5lib")

soup = get_beautiful_soup(myurl)

Next, I needed to see how to start traversing the html to get the elements that I needed. I recalled something I read a while ago and was (luckily) able to find some help.

That got me this:

tablemgmt = soup.findAll('div', attrs={'id':'our-team'})

This was close, but it only returned 2 of the 3 div tags I cared about (the management team has a different id for some reason).

I did a search for regular expressions and Python and found this useful Stack Overflow question. It showed that if I updated my imports to include re, I could use regular expressions.

Great, so I updated the imports section to this:

import bs4, requests, re

And I added re.compile to my findAll call to get this:

tablemgmt = soup.findAll('div', attrs={'id':re.compile('our-team')})

Now I had all 3 of the div tags I cared about.
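Why does the regular expression pick up the extra div? As far as I understand from the Beautiful Soup docs, a compiled pattern is matched with its search() method, so it only has to appear somewhere in the id. A small sketch with plain re shows the difference; the id values here are hypothetical stand-ins, not the real ones from the page.

```python
import re

# Hypothetical id values; the real page's ids may differ.
div_ids = ["our-team-mgmt", "our-team", "our-team", "footer"]

pattern = re.compile("our-team")

# A plain string has to match the whole id ...
exact = [i for i in div_ids if i == "our-team"]

# ... while a regex search only needs to find the pattern somewhere in it.
partial = [i for i in div_ids if pattern.search(i)]

print(len(exact), exact)
print(len(partial), partial)
```

The exact comparison finds two ids and the pattern finds three, which mirrors the 2-of-3 versus 3-of-3 result above.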

Of course the next thing I wanted to do was get the information I cared about out of the tablemgmt structure.

When I printed out the results I noticed leading and trailing square brackets, and every time I tried to do something I'd get an error.

It took an embarrassingly long time to realize that I needed to treat tablemgmt as an array (a list, in Python terms). Whoops!
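The same trap can be reproduced with an ordinary Python list. This is only a stand-in sketch: the strings below are placeholder values, not real findAll results, but the square brackets and the error have the same cause.

```python
# Stand-in for a list of matches; the values are placeholders.
matches = ["  Mike Cheley ", " Ozzy  "]

print(matches)  # the square brackets in the output are the giveaway

try:
    matches.strip()  # treating the whole list like a single item
except AttributeError as err:
    print("error:", err)

# Looping (or a list comprehension) handles one item at a time instead.
cleaned = [m.strip() for m in matches]
print(cleaned)
```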

Once I got through that it was straightforward to loop through the data and output it:

list_of_names = []
for i in tablemgmt:
    for row in i.findAll('span', attrs={'class':'team-name'}):
        # row.text is already plain text, so this replace is effectively a no-op
        text = row.text.replace('<span class="team-name"', '')
        if len(text)>0:
            list_of_names.append(text)

list_of_titles = []
for i in tablemgmt:
    for row in i.findAll('span', attrs={'class':'team-title'}):
        # same as above: skip empty spans, keep everything else
        text = row.text.replace('<span class="team-title"', '')
        if len(text)>0:
            list_of_titles.append(text)

The last bit I wanted to do was add some headers and make the lists into a two-column MultiMarkdown table.

OK, first I needed to see how to 'combine' the lists into a multidimensional array. Another Google search and ... success. Of course the answer would be on Stack Overflow.
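For anyone who hasn't met zip before, here is a minimal sketch of what it does with two lists. The second list is deliberately one item short to show a detail worth knowing: zip stops at the shorter list without raising an error, so a missing title would silently drop a name.

```python
names = ["Mike Cheley", "Ozzy", "Jay Sant"]
titles = ["CEO/Creative Director", "Official Greeter"]  # one short on purpose

# zip pairs items by position and stops when the shorter list runs out.
pairs = list(zip(names, titles))
print(pairs)
```

If the two lists could ever get out of step, comparing len(names) and len(titles) before zipping is a cheap sanity check.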

With my knowledge of looping through arrays and the zip function, I was able to get this:

for j, k in zip(list_of_names, list_of_titles):
    print('|'+ j + '|' + k + '|')

Which would output this:

|Mike Cheley|CEO/Creative Director|
|Ozzy|Official Greeter|
|Jay Sant|Vice President|
|Shawn Isaac|Vice President|
|Jason Gurzi|SEM Specialist|
|Yvonne Valles|Director of First Impressions|
|Ed Lowell|Senior Designer|
|Paul Hasas|User Interface Designer|
|Alan Schmidt|Senior Web Developer|

This is close; however, it still needs headers.

No problem, just add some static lines to print out:

print('| Name | Title |')
print('| --- | --- |')

And voila, we have a MultiMarkdown table that was scraped from a web page:

| Name | Title |
| --- | --- |
|Mike Cheley|CEO/Creative Director|
|Ozzy|Official Greeter|
|Jay Sant|Vice President|
|Shawn Isaac|Vice President|
|Jason Gurzi|SEM Specialist|
|Yvonne Valles|Director of First Impressions|
|Ed Lowell|Senior Designer|
|Paul Hasas|User Interface Designer|
|Alan Schmidt|Senior Web Developer|

Which will render to this:

| Name | Title |
| --- | --- |
| Mike Cheley | CEO/Creative Director |
| Ozzy | Official Greeter |
| Jay Sant | Vice President |
| Shawn Isaac | Vice President |
| Jason Gurzi | SEM Specialist |
| Yvonne Valles | Director of First Impressions |
| Ed Lowell | Senior Designer |
| Paul Hasas | User Interface Designer |
| Alan Schmidt | Senior Web Developer |
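The header lines and the zip loop could also be pulled together into one small function. This is just a sketch of one way to do it, not what the script actually does: markdown_table is a hypothetical helper name, and the two short lists stand in for the scraped data.

```python
def markdown_table(headers, rows):
    """Return a pipe-delimited Markdown table as a single string."""
    lines = [
        "| " + " | ".join(headers) + " |",
        "| " + " | ".join("---" for _ in headers) + " |",
    ]
    for row in rows:
        # each row is a sequence of strings, one per column
        lines.append("|" + "|".join(row) + "|")
    return "\n".join(lines)


# Stand-ins for the scraped lists.
names = ["Mike Cheley", "Ozzy"]
titles = ["CEO/Creative Director", "Official Greeter"]

table = markdown_table(["Name", "Title"], zip(names, titles))
print(table)
```

Because the function builds one row per zipped tuple, adding a third list (image cells, say) only means passing a third header and a third argument to zip.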

