I got this amazing opportunity to participate in GSoC’20 with the organisation “MacPorts”. Last year I successfully participated in GSoC’19 under the same organisation. My report from GSoC’19 can be seen here.

This year, my project was to make some significant enhancements to the macports-webapp (ports.macports.org). I was mentored by Mojca Miklavec, Amar Takhar and Rajdeep Bharati and learnt a great deal from them. The process this year was more than just coding things away: we had to do a lot of brainstorming to optimise the website, make decisions regarding the flow and prioritise the features. Amar once described it as more of a case study, and I couldn’t agree more.

I will first list the major goals briefly and then we will take a look at them in detail later in the post.

Goals

  • Code refactoring: Simplifying and improving the codebase by splitting it into smaller modules. The idea is to make maintenance and new contributions easier.
  • Site acceleration: Improving the database queries and carefully caching some parts of the website.
  • Advanced Search: Adding a comprehensive search engine to the website to make port discovery easier by giving advanced filters and options to narrow down the results.
  • Watchlist & “Maintained by me”: Allowing login/signup functionality on the website, giving users the ability to follow ports and create a watchlist. Users can get a quick overview of the ports they follow and the ports they maintain. Maintained ports are identified automatically using GitHub profiles and emails.
  • Notifications: Users can receive notifications for followed ports when certain events are triggered (port updates).
  • LiveCheck: Displaying livecheck results for ports on the website itself.
  • A refreshed UI: A lot of enhancements have been made to the UI, from a new homepage to a basic port page with installation instructions (a kind of splash screen).
  • Chart.JS and asynchronous loading: Dropping the dependency on Google Charts and migrating all the charts to Chart.JS. The charts now load asynchronously so that the website does not slow down due to the underlying complex queries that build the charts.
  • REST API: Writing a comprehensive REST API that allows the app to be extended with better frontends and is useful to developers building tools based on this information.

and a lot of other improvements…

Completed work

At the time of writing this post, most of the work I did during GSoC’20 is contained in the gsoc2020 branch of the repository (this direct link might stop working later, when the branch gets merged into master and probably deleted). I have made 223 commits.

Other than this, some small changes were made to the following:


Let’s see everything in action

We will now take a look at all the exciting new features in detail. I will include screenshots wherever possible.

A better codebase and project structure

We moved away from the traditional Django directory structure in which settings, wsgi and base urls.py are located in the sub-directory (named the same as the project). We brought the contents of this directory into the root and got rid of it completely. This gives more of a natural feel now.

Then, the project was split into several smaller Django apps. Most of the functionality is now isolated and well structured in smaller modules, which allows easier maintenance.

Once the project was structured, I worked on actually improving the code. In this process I managed to reduce the code by around 30% and improved the database queries to run much faster. I also have more confidence in the code now, and the chances of things going wrong are much lower. To mention some of the major improvements specifically:

  • A single class method of the model Port now takes care of adding initial data, doing incremental updates and full updates all by itself. Less code to manage.
  • The script responsible for getting the updated paths from git, git_update.py, has been completely rewritten to avoid recursion and exceptions that could cause it to break.
  • All the queries for calculating statistics have been improved and now run much faster than before.
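
The idea behind a single update routine can be sketched roughly as follows. This is not the actual macports-webapp code: a plain dictionary stands in for the Port table, and `load_ports`, the field names and the sample records are all hypothetical. The point is only that one upsert-style routine can serve the initial load, an incremental update and a full update.

```python
# Hypothetical sketch: a dict keyed by port name stands in for the Port table.
def load_ports(table, records, full=False):
    """Insert or update ports from PortIndex-like records.

    table   -- dict mapping port name -> dict of fields (stand-in for the database)
    records -- iterable of dicts, each with at least a "name" key
    full    -- when True, ports missing from `records` are marked inactive
    Returns a (created, updated) pair of counts.
    """
    created = updated = 0
    seen = set()
    for record in records:
        name = record["name"]
        seen.add(name)
        fields = dict(record, active=True)
        if name not in table:
            table[name] = fields
            created += 1
        elif table[name] != fields:
            table[name] = fields
            updated += 1
    if full:
        # Only a full update knows which ports no longer exist in the index.
        for name, fields in table.items():
            if name not in seen:
                fields["active"] = False
    return created, updated


table = {}
# Initial load: everything is new.
print(load_ports(table, [{"name": "zlib", "version": "1.2.11"},
                         {"name": "curl", "version": "7.71.0"}]))
# Incremental update: only the changed port is passed in.
print(load_ports(table, [{"name": "curl", "version": "7.72.0"}]))
# Full update: zlib is absent from the index, so it is marked inactive.
print(load_ports(table, [{"name": "curl", "version": "7.72.0"}], full=True))
```

Keeping one code path means the three kinds of update cannot drift apart, which is the "less code to manage" benefit mentioned above.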

The search engine (based on Solr)

A dedicated search page, located at /search, now provides some very important filters and options to narrow down search results and discover exactly what the user needs.

To list the features:

  • Filter by the files installed by ports
  • Filter by maintainer, category or variant (simultaneously)
  • Filter by port version status (up-to-date, outdated, livecheck is broken)
  • Show deleted ports
  • Ranked search results when matching only by port name
  • Suggestions for maintainers, categories and variants based on current results

This advanced search page has made the separate category, variant and maintainer pages obsolete, giving us another opportunity to reduce code.

Along with the search page, a search bar with autocompletion has also been added to each page using Twitter Typeahead. This box shows basically the same results as the search page when the “Only match by port names” option is selected, which allows the search results to be ranked.
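
To give a feel for what ranking by port name can mean, here is a toy sketch. On the website the ranking comes from Solr, and this is not Solr's scoring. The function `rank_by_name` and its rules (exact match first, then prefix matches, then other substring matches, shorter names first) are assumptions made purely for illustration.

```python
def rank_by_name(query, names):
    """Order port names by how closely they match the query (toy scoring)."""
    q = query.lower()

    def score(name):
        n = name.lower()
        if n == q:
            return 0   # exact match
        if n.startswith(q):
            return 1   # prefix match
        return 2       # query appears somewhere inside the name

    # Keep only names containing the query, then sort by score, length and name.
    matches = [n for n in names if q in n.lower()]
    return sorted(matches, key=lambda n: (score(n), len(n), n))


ports = ["py38-numpy", "python38", "python", "bpython", "zlib"]
print(rank_by_name("python", ports))  # ['python', 'python38', 'bpython']
```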


Livecheck and local MacPorts installation

A lot of new functionality could be enabled on the website only because of the local MacPorts installation that now runs on the server itself. This has allowed us to achieve the following:

  • Livecheck results for all ports obtained by running: port livecheck port-name
  • Updates to the ports table can now be made almost in real time because the PortIndex is now generated locally and not fetched from the rsync server. Generating a PortIndex locally can give inconsistent results because of the server-side platform. However, the portindex command provides the ability to fake the platform, so we generate the PortIndex for the latest macOS version.
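
Running livecheck from the web app could look roughly like the sketch below. It only wraps the `port livecheck port-name` command mentioned above. The function names, the timeout and the injectable `runner` parameter are assumptions for illustration, and the output is returned unparsed because its exact format is not covered here.

```python
import subprocess


def livecheck_command(port_name):
    """Return the argument list for `port livecheck <port-name>`."""
    return ["port", "livecheck", port_name]


def run_livecheck(port_name, runner=subprocess.run):
    """Run livecheck for one port and return (exit_code, raw_output).

    `runner` defaults to subprocess.run; passing a stub makes the function
    testable on machines without a MacPorts installation.
    """
    result = runner(
        livecheck_command(port_name),
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode bytes to str
        timeout=120,          # assumed limit so one slow port cannot block the rest
    )
    return result.returncode, result.stdout.strip()

# run_livecheck("curl") would invoke the real `port` binary on the server.
```

Passing the command as a list (rather than a shell string) avoids quoting problems with unusual port names.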

Splash screen and new port’s page

A simplified page now sits on top of the port page, which is full of technical information. Since this information is not useful to a general user, they only see a simple page with instructions on how to install the port and its description.

However, developers can bypass this page by making the detailed port page the default.

The look and feel of the port page has been completely changed to include more information like port notes, variant descriptions, subports and OS name for port health.


New charts based on Chart.JS and asynchronous loading

We dropped Google Charts and opted for Chart.JS, which removed the dependency on any CDN. The new Chart.JS-based charts fetch data through the API, which uses improved queries, and then process the data (so that it fits the charts) using JavaScript, thus reducing the server-side load.

The charts load asynchronously, which provides a great user experience despite the noticeable time taken to build them.

The duration charts are cached for one hour and monthly charts are cached for 24 hours using Memcached, boosting the website performance.
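
The caching pattern can be sketched in a few lines. The website uses Memcached. The `TTLCache` class below is only an in-memory stand-in written for this post, and the key names and the builder function are hypothetical. It shows the "return the cached value until it expires, otherwise rebuild" logic with the two timeouts mentioned above.

```python
import time

HOUR = 60 * 60
DAY = 24 * HOUR


class TTLCache:
    """Tiny in-memory stand-in for Memcached: values expire after a timeout."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._store = {}  # key -> (expiry_time, value)

    def get_or_build(self, key, timeout, build):
        """Return the cached value for `key`, calling `build()` if it is missing or expired."""
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = build()
        self._store[key] = (now + timeout, value)
        return value


cache = TTLCache()
calls = []


def build_monthly_chart():
    calls.append("monthly")  # stands in for an expensive database query
    return {"labels": ["Jul", "Aug"], "data": [10, 12]}


cache.get_or_build("monthly-chart", DAY, build_monthly_chart)
cache.get_or_build("monthly-chart", DAY, build_monthly_chart)
print(len(calls))  # 1: the second call was served from the cache
```

A duration chart would use the same call with `HOUR` as the timeout.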


Watchlist and Notifications

Users can now log in using their email addresses or, more easily, using GitHub login. Once logged in, they can follow ports and receive notifications when these ports get updated or changed. They can also get a quick overview of these ports on a single page.

Maintainers can also get an overview of the ports maintained by them. This matching is done using their email addresses and GitHub accounts. A user can add any number of emails and GitHub handles to an account.
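
The matching described above boils down to something like the following sketch. The data layout (each port carrying a list of maintainers with optional `email` and `github` keys) and the function name `maintained_ports` are assumptions for illustration, not the app's actual models.

```python
def maintained_ports(ports, emails, github_handles):
    """Return the names of ports whose maintainers match any of the user's
    email addresses or GitHub handles (compared case-insensitively)."""
    emails = {e.lower() for e in emails}
    handles = {h.lower() for h in github_handles}
    result = []
    for port in ports:
        for maintainer in port["maintainers"]:
            email = maintainer.get("email")
            github = maintainer.get("github")
            if (email and email.lower() in emails) or \
               (github and github.lower() in handles):
                result.append(port["name"])
                break  # one matching maintainer is enough for this port
    return result


ports = [
    {"name": "curl", "maintainers": [{"email": "alice@example.org", "github": "alice"}]},
    {"name": "zlib", "maintainers": [{"github": "Bob"}]},
    {"name": "git", "maintainers": [{"email": "carol@example.org"}]},
]
# A user who has registered one email address and one GitHub handle:
print(maintained_ports(ports, ["Alice@example.org"], ["bob"]))  # ['curl', 'zlib']
```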

The ports can also be directly followed from the search page by clicking the + button in front of the port name. This uses AJAX and the page does not refresh.


Other UI Enhancements

Dark Mode


New landing page


New filters for build page: The build page has new filters that allow multiple selection.


The comprehensive REST API

A complete REST API has been built using Django REST Framework. It exposes endpoints for all the information that the app gathers and provides some amazing methods to narrow down the data. The API comes with list views, detail views, filters, search and sort functionality, which make it very easy to find information stored in any corner of the app.

The API even exposes the autocomplete search functionality obtained using Solr.
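
From a client's point of view, combining filters, search and sorting is just a matter of query parameters. The sketch below builds such a URL. The base URL, the `ports` resource and the `categories` filter are hypothetical. `search` and `ordering` are the default parameter names of Django REST Framework's SearchFilter and OrderingFilter, and it is an assumption that the app keeps those defaults.

```python
from urllib.parse import urlencode


def build_list_url(base, resource, filters=None, search=None, ordering=None):
    """Build a list-view URL with optional filter, search and ordering parameters."""
    params = dict(filters or {})
    if search:
        params["search"] = search      # DRF SearchFilter default parameter name
    if ordering:
        params["ordering"] = ordering  # DRF OrderingFilter default; "-" means descending
    url = f"{base.rstrip('/')}/{resource.strip('/')}/"
    return f"{url}?{urlencode(params)}" if params else url


print(build_list_url("https://example.org/api", "ports",
                     filters={"categories": "devel"},
                     search="json", ordering="-name"))
# https://example.org/api/ports/?categories=devel&search=json&ordering=-name
```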


I have tried to include the highlights of the project in this blog post. I had a great learning experience while taking the site from a rough state to a more polished version. I learnt some advanced new technologies; thanks to Amar Takhar for all the help I needed while getting accustomed to them.

This GSoC was not a regular GSoC. My mentors allowed me to adjust the timeline according to my college schedule, I was given more-than-regular freedom in taking project-related decisions and, as I already mentioned, this time it was not just about building something that does the job. It was much more than that: it was about optimisation, a better experience, and a neat and easy-to-maintain codebase.

I hope the new webapp turns out to be useful for the MacPorts Project and I get an opportunity to give back. Two years with MacPorts have been full of learning and challenges.

A huge thanks to my mentors, org admins, MacPorts organisation and Google Summer of Code!