This project uses Puppeteer to scrape data from GitHub and provides an easy-to-use API to access GitHub profile data and repository details.
Clone the repository and install the dependencies with the following commands:
git clone https://github.com/Aitaneuh/github-scraper-api.git
cd github-scraper-api
npm install
- Start the server with the following command:
node src/server.js
- Once the server is running, you can query the API to retrieve profile or repository information.
- When sending requests, you can watch the console. It should log what happens in real time.
Access a GitHub user's profile by sending a GET request to the following URL:
http://localhost:4000/github/profile/:username
Replace :username with the GitHub username you want to retrieve information about.
Example:
GET http://localhost:4000/github/profile/Aitaneuh
Response (Example):
{
"name": "Aitaneuh",
"bio": "I am learning computer science and software development. I live in Switzerland.",
"followers": "4",
"following": "9",
"reposCount": "7",
"avatarUrl": "https://avatars.githubusercontent.com/u/130589742?v=4",
"org": "none",
"location": "none",
"currentTime": "20:58 (UTC +01:00)",
"links": [
"https://x.com/Aitaneuh"
],
"repositories": [
{
"name": "github-scraper-api",
"description": "A small project that creates a little API to get the data of a GitHub profile.",
"stars": "0",
"language": "JavaScript"
}
]
}
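The counts in the example response are strings, so a client will usually want to convert them before doing arithmetic. The following is a minimal sketch of such a client, assuming Node 18 or later (for the global `fetch`) and a server listening on port 4000. `fetchProfile` and `toCount` are hypothetical helper names and are not part of this project.

```javascript
// Hypothetical client helpers; not part of this project.
const BASE_URL = "http://localhost:4000"; // assumes the port used throughout this README

// Convert a count string such as "4" into a number.
// Returns null when the value is not a plain non-negative integer (for example "none").
function toCount(value) {
  const text = String(value).trim();
  return /^\d+$/.test(text) ? Number(text) : null;
}

// Request a profile and return it with numeric counts.
async function fetchProfile(username) {
  const url = `${BASE_URL}/github/profile/${encodeURIComponent(username)}`;
  const res = await fetch(url);
  if (!res.ok) {
    throw new Error(`Request failed with status ${res.status}`);
  }
  const profile = await res.json();
  return {
    ...profile,
    followers: toCount(profile.followers),
    following: toCount(profile.following),
    reposCount: toCount(profile.reposCount),
  };
}
```

With the server running, `fetchProfile("Aitaneuh").then(console.log)` would print the profile with numeric `followers`, `following`, and `reposCount` fields.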
Retrieve details of a specific repository by sending a GET request to:
http://localhost:4000/github/repository/:username/:repo
Replace :username with the GitHub username and :repo with the repository name.
Example:
GET http://localhost:4000/github/repository/Aitaneuh/github-scraper-api
Response (Example):
{
"name": "github-scraper-api",
"description": "A small project that creates a little API to get the data of a GitHub profile.",
"commits": "3",
"stars": "0",
"forks": "0",
"languages": ["JavaScript"],
"contributors": "Not Found"
}
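Because this route takes two path parameters, it helps to build the URL in one place and encode each segment. The sketch below assumes Node 18 or later and the default port from this README. `repositoryUrl` and `fetchRepository` are hypothetical names, and treating "Not Found" as a missing value is an assumption based only on the example response above.

```javascript
// Hypothetical helpers; only the route shape comes from this README.
function repositoryUrl(username, repo, baseUrl = "http://localhost:4000") {
  // Encode each segment so unusual characters cannot change the path.
  const user = encodeURIComponent(username);
  const name = encodeURIComponent(repo);
  return `${baseUrl}/github/repository/${user}/${name}`;
}

// Fetch repository details and map the "Not Found" placeholder to null.
async function fetchRepository(username, repo) {
  const res = await fetch(repositoryUrl(username, repo));
  if (!res.ok) {
    throw new Error(`Request failed with status ${res.status}`);
  }
  const details = await res.json();
  return {
    ...details,
    // Assumption: "Not Found" marks a value the scraper could not read.
    contributors: details.contributors === "Not Found" ? null : details.contributors,
  };
}
```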
Retrieve all public repositories of a user by sending a GET request to:
http://localhost:4000/github/all-repositories/:username
Replace :username with the GitHub username.
Example:
GET http://localhost:4000/github/all-repositories/Aitaneuh
Response (Example):
[
{
"name": "github-scraper-api",
"description": "A small project that creates a little API to get the data of a GitHub profile.",
"stars": "0",
"language": "JavaScript"
}
]
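The array returned by this route is easy to aggregate on the client side. As a minimal sketch, the hypothetical `summarizeRepositories` function below counts repositories per language and sums the stars, assuming each entry has the `stars` and `language` fields shown in the example response.

```javascript
// Hypothetical helper; operates on the array returned by /github/all-repositories/:username.
function summarizeRepositories(repos) {
  const byLanguage = {};
  let totalStars = 0;
  for (const repo of repos) {
    // Entries without a language are grouped under "Unknown".
    const language = repo.language || "Unknown";
    byLanguage[language] = (byLanguage[language] || 0) + 1;
    // Star counts arrive as strings; skip values that are not numeric.
    const stars = Number.parseInt(repo.stars, 10);
    if (!Number.isNaN(stars)) {
      totalStars += stars;
    }
  }
  return { total: repos.length, totalStars, byLanguage };
}
```

Passing the example response above to this function would yield one repository, zero stars, and a single "JavaScript" entry.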
If you prefer to run this API inside a Docker container instead of running it locally, you can use the prebuilt image available on Docker Hub. The image is named aitaneuh/scraper-api.
To pull the Docker image, use the following command:
docker pull aitaneuh/scraper-api:latest
After pulling the image, you can run the Docker container with the following command:
docker run -p 4000:4000 aitaneuh/scraper-api:latest
This will start the API inside a Docker container, and you can access it at http://localhost:4000.
The container includes a basic health check to ensure it is running correctly. You can check the health status by sending a GET request to:
http://localhost:4000/health
If the container is healthy, it will respond with an ok status.
Response (Example):
{
"status": "ok",
"timestamp": "2025-03-10T20:56:05.269Z",
"uptime": "00H 12M 42S",
"version": "1.3.0"
}
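A script that starts the container may want to wait until the health route answers before sending real requests. The sketch below polls /health and also parses the `uptime` string into seconds. It assumes Node 18 or later and that `uptime` always follows the "00H 12M 42S" shape from the example. `waitForHealthy` and `parseUptime` are hypothetical names, and the retry count and delay are arbitrary defaults.

```javascript
// Hypothetical helpers; not part of this project.

// Parse an uptime string such as "00H 12M 42S" into seconds.
// Returns null when the text does not match that shape.
function parseUptime(text) {
  const match = /^(\d+)H (\d+)M (\d+)S$/.exec(text);
  if (!match) return null;
  const hours = Number(match[1]);
  const minutes = Number(match[2]);
  const seconds = Number(match[3]);
  return hours * 3600 + minutes * 60 + seconds;
}

// Poll /health until it reports "ok" or the attempts run out.
async function waitForHealthy(baseUrl = "http://localhost:4000", attempts = 10, delayMs = 500) {
  for (let i = 0; i < attempts; i++) {
    try {
      const res = await fetch(`${baseUrl}/health`);
      if (res.ok && (await res.json()).status === "ok") {
        return true;
      }
    } catch {
      // The server is not accepting connections yet; retry after the delay.
    }
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  return false;
}
```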
Retrieve information about the project in general by sending a GET request to:
http://localhost:4000/info
Example:
GET http://localhost:4000/info
Response (Example):
{
"creator": "Aitaneuh",
"description": "This API is made to give an easy access to data from github or maybe others in the future",
"started_on": "04.03.2025",
"github_repo": "https://github.com/Aitaneuh/github-scraper-api",
"docker_image": "aitaneuh/scraper-api:latest",
"repo_stats": "http://localhost:4000/github/repository/Aitaneuh/github-scraper-api"
}
Retrieve only the uptime of the API, in milliseconds, by sending a GET request to:
http://localhost:4000/uptime
Example:
GET http://localhost:4000/uptime
Response (Example):
{
"uptime": 10935
}
In this example, the uptime was almost 11 seconds.
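To display the millisecond value in a readable form, a client can convert it itself. The hypothetical `formatUptime` function below is a small sketch that mirrors the "00H 12M 42S" shape seen in the /health example. Whether the server builds its own string this way is an assumption.

```javascript
// Hypothetical helper; converts milliseconds into an "HHH MMM SSS" style string.
function formatUptime(ms) {
  const totalSeconds = Math.floor(ms / 1000); // drop the fractional second
  const hours = Math.floor(totalSeconds / 3600);
  const minutes = Math.floor((totalSeconds % 3600) / 60);
  const seconds = totalSeconds % 60;
  const pad = (n) => String(n).padStart(2, "0");
  return `${pad(hours)}H ${pad(minutes)}M ${pad(seconds)}S`;
}
```

Note that the value is truncated rather than rounded, so the uptime from the example response is shown as 10 seconds.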
- server.js: Contains the Express server code and API routes.
- profile.js: Fetches GitHub profile data.
- repositories.js: Fetches a list of repositories of a user.
- repositoryDetails.js: Fetches detailed information about a repository.
- allRepositories.js: Fetches all public repositories of a user.
- getProfileData(page): Retrieves profile information such as name, bio, followers, and repositories count.
- getRepositories(page): Retrieves a list of repositories with name, description, stars, and language.
- getRepositoryDetails(page): Retrieves detailed information about a repository, including commits, forks, contributors, and languages.
- getRepositoryAllRepositories(browser, username, pageCount): Retrieves all repositories of a user.
- Puppeteer: Used for web scraping GitHub data.
- Express: Handles API requests and server routing.
- Cors: Enables cross-origin requests.
- Include additional data like issues or pull requests.
- Support fetching stars and forks for organizations.
Contributions are welcome! Feel free to submit pull requests to improve the project or add new features.