Easily get search results from Google, Bing, DuckDuckGo, etc. using GoSearch

RahulSDevloper/GoSearch-Search-Engine-Scraper

🔍 Search Engine Scraper - GoSearch

GoSearch

   _____                     _     _____             _            
  / ____|                   | |   |  ___|           (_)           
 | (___   ___  __ _ _ __ ___| |__ | |__ _ __   __ _  _ _ __   ___ 
  \___ \ / _ \/ _` | '__/ __| '_ \|  __| '_ \ / _` || | '_ \ / _ \
  ____) |  __/ (_| | | | (__| | | | |__| | | | (_| || | | | |  __/
 |_____/ \___|\__,_|_|  \___|_| |_\____/_| |_|\__, ||_|_| |_|\___|
   _____                                        __/ |              
  / ____|                                      |___/               
 | (___   ___ _ __ __ _ _ __   ___ _ __                            
  \___ \ / __| '__/ _` | '_ \ / _ \ '__|                           
  ____) | (__| | | (_| | |_) |  __/ |                              
 |_____/ \___|_|  \__,_| .__/ \___|_|                              
                       | |                                         
                       |_|                                         

High-performance, anti-detection search engine scraper - Built with advanced Go concurrency patterns


✨ Key Features


Google, Bing & DuckDuckGo

Bypass CAPTCHAs & Blocks

Chrome-Based Scraping

Domain, Keyword & More

Keyword Extraction & Ad Detection

Avoid Rate Limiting

🚀 Installation

Method / Commands
From Binary
# Download the latest release
curl -sSL https://github.com/RahulSDevloper/Search-Engine-Scraper---Golang/releases/download/v1.0.0/gosearch-linux-amd64 -o gosearch
chmod +x gosearch
./gosearch --query "golang programming"
From Source
git clone https://github.com/RahulSDevloper/Search-Engine-Scraper---Golang.git
cd Search-Engine-Scraper---Golang
go build -ldflags="-s -w" -o gosearch
./gosearch --query "golang programming"
Using Docker
docker pull rahulsdevloper/gosearch:latest
docker run rahulsdevloper/gosearch --query "golang programming"

🔧 Usage

Usage: gosearch [OPTIONS] [QUERY]

Options:
  --query string         Search query
  --engine string        Search engine (google, bing, duckduckgo, all) (default "google")
  --max int              Maximum results to fetch (default 10)
  --ads                  Include advertisements in results
  --timeout duration     Search timeout (default 30s)
  --proxy string         Proxy URL (e.g., http://user:pass@host:port)
  --headless             Use headless browser (recommended for avoiding detection)
  --lang string          Language code (default "en")
  --region string        Region code (default "us")
  --format string        Output format (json, csv, table) (default "json")
  --output string        Output file (default: stdout)
  --page int             Result page number (default 1)
  --min-words int        Minimum word count in description
  --max-words int        Maximum word count in description
  --domain string        Filter results by domain (include)
  --exclude-domain string Filter results by domain (exclude)
  --keyword string       Filter results by keyword
  --type string          Filter by result type (organic, special, etc.)
  --site string          Limit results to specific site
  --filetype string      Limit results to specific file type
  --verbose              Enable verbose logging
  --debug                Enable debug mode (saves HTML responses)
  --log string           Log file path
  --stats string         Statistics output file
  --help                 Show help
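
The filter flags above (--domain, --exclude-domain, --min-words, --max-words) can be read as predicates applied to each result. The following is a minimal sketch of that idea, not GoSearch's own code: the Result and Filter types are hypothetical, and the matching rules (suffix match on the host, word count on the description) are assumptions that may differ from what the tool actually does.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// Result is a hypothetical result record, not the library's own type.
type Result struct {
	Title       string
	URL         string
	Description string
}

// Filter holds assumed equivalents of the CLI filter flags.
type Filter struct {
	Domain        string // keep only hosts matching this suffix (empty = no filter)
	ExcludeDomain string // drop hosts matching this suffix (empty = no filter)
	MinWords      int    // minimum words in the description (0 = no minimum)
	MaxWords      int    // maximum words in the description (0 = no maximum)
}

// hostMatches reports whether host equals suffix or is a subdomain of it.
func hostMatches(host, suffix string) bool {
	host = strings.ToLower(host)
	suffix = strings.ToLower(strings.TrimPrefix(suffix, "."))
	return host == suffix || strings.HasSuffix(host, "."+suffix)
}

// Keep reports whether r passes every configured filter.
func (f Filter) Keep(r Result) bool {
	u, err := url.Parse(r.URL)
	if err != nil {
		return false // unparseable URLs are dropped in this sketch
	}
	host := u.Hostname()
	if f.Domain != "" && !hostMatches(host, f.Domain) {
		return false
	}
	if f.ExcludeDomain != "" && hostMatches(host, f.ExcludeDomain) {
		return false
	}
	words := len(strings.Fields(r.Description))
	if words < f.MinWords {
		return false
	}
	if f.MaxWords > 0 && words > f.MaxWords {
		return false
	}
	return true
}

// Apply returns the results that pass the filter, preserving order.
func Apply(f Filter, in []Result) []Result {
	var out []Result
	for _, r := range in {
		if f.Keep(r) {
			out = append(out, r)
		}
	}
	return out
}

func main() {
	results := []Result{
		{"MIT course", "https://ocw.mit.edu/ml", "An introductory machine learning course"},
		{"Blog post", "https://example.com/ml", "Short note"},
	}
	// Roughly what `--domain edu --min-words 3` might do under these assumptions.
	for _, r := range Apply(Filter{Domain: "edu", MinWords: 3}, results) {
		fmt.Println(r.Title, r.URL)
	}
}
```

Treating each flag as an independent predicate keeps the filters composable: a result is kept only if every configured check passes.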

🌟 Examples

Basic Search with Google 🔍
./gosearch --query "golang programming"
Search with Advanced Filters 🧰
./gosearch --query "machine learning" --engine bing --domain edu --format table
Multi-Engine Search with Headless Browser 🌐
./gosearch --query "climate science" --engine all --headless --output results.json
Filetype Specific Search 📄
./gosearch --query "research papers" --filetype pdf --site edu --max 20

🧠 Advanced Techniques

Using as a Library

package main

import (
    "context"
    "fmt"
    "time"
    
    "github.com/RahulSDevloper/Search-Engine-Scraper---Golang/pkg/engines"
    "github.com/RahulSDevloper/Search-Engine-Scraper---Golang/pkg/models"
)

func main() {
    // Create a new Google search engine
    engine := engines.NewGoogleSearchEngine()
    
    // Configure the search request
    request := models.SearchRequest{
        Query:       "golang concurrency patterns",
        MaxResults:  10,
        Timeout:     30 * time.Second,
        UseHeadless: true,
        Debug:       true,
    }
    
    // Execute search with context for cancellation
    ctx, cancel := context.WithTimeout(context.Background(), 45*time.Second)
    defer cancel()
    
    results, err := engine.Search(ctx, request)
    if err != nil {
        fmt.Printf("Error: %v\n", err)
        return
    }
    
    // Print each result's rank, title, and URL
    for i, result := range results {
        fmt.Printf("%d. %s\n%s\n\n", i+1, result.Title, result.URL)
    }
}

Custom Rate Limiting

# ~/.config/gosearch/config.yaml
rate_limits:
  google: 10   # requests per minute
  bing: 15
  duckduckgo: 20

proxy_rotation:
  enabled: true
  proxies:
    - http://proxy1:8080
    - http://proxy2:8080
  rotation_strategy: round-robin  # or random

🐞 Debugging

No Results Found?

If you're not getting any results, try these solutions:

  1. Use Headless Mode to avoid detection

    ./gosearch --query "your search" --headless
  2. Use a Proxy to route through a clean IP address

    ./gosearch --query "your search" --proxy http://your-proxy-server:port
  3. Enable Debug Mode to examine the HTML response

    ./gosearch --query "your search" --debug

Debugging Process Flow

graph TD
    A[Run Search] --> B{Results Found?}
    B -->|Yes| C[Process Results]
    B -->|No| D[Enable Debug Mode]
    D --> E[Check HTML Responses]
    E --> F{Captcha Present?}
    F -->|Yes| G[Use Headless + Proxy]
    F -->|No| H[Check Selectors]
    H --> I[Update Selectors]
    I --> A
    G --> A
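
The decision points in the flow above can be sketched as a small function that inspects a saved HTML response and suggests the next step. This is a hedged illustration: the marker strings are assumed heuristics (real block pages differ between engines and change over time), and Diagnose is a hypothetical helper, not part of GoSearch.

```go
package main

import (
	"fmt"
	"strings"
)

// captchaMarkers are assumed heuristics for spotting a block page.
var captchaMarkers = []string{
	"captcha",
	"unusual traffic",
	"are you a robot",
}

// Diagnose maps a response body and result count to the next step in the flow.
func Diagnose(html string, resultCount int) string {
	if resultCount > 0 {
		return "process results"
	}
	lower := strings.ToLower(html)
	for _, m := range captchaMarkers {
		if strings.Contains(lower, m) {
			return "retry with --headless and --proxy"
		}
	}
	// No results and no sign of a block page: the selectors are the likely culprit.
	return "check selectors"
}

func main() {
	fmt.Println(Diagnose("<div id=\"results\">...</div>", 5))
	fmt.Println(Diagnose("<form id=\"captcha-form\">...</form>", 0))
	fmt.Println(Diagnose("<div class=\"renamed-container\"></div>", 0))
}
```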

📊 Performance Benchmarks

Engine            Results/Second  Memory Usage  Detection Avoidance
Google            6.5             Low           High
Bing              8.2             Low           Medium
DuckDuckGo        7.3             Low           Very High
All (Concurrent)  4.8             Medium        Medium

📚 Design Philosophy

The Search Engine Scraper follows these core principles:

  1. Resilience First: Designed to handle the constantly changing DOM structures of search engines
  2. Performance Focused: Optimized for speed while maintaining low resource usage
  3. Privacy Conscious: Minimal footprint to avoid detection
  4. Developer Friendly: Clean API for integration into other Go applications

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.


Made with love by RahulSDevloper

⭐ Star this project if you find it useful! ⭐