Module: Spidr::Sanitizers

Defined in:
lib/spidr/sanitizers.rb

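Class Method Summary

  • + (Object) included(base)

Instance Method Summary

  • - (Sanitizers) initialize(options = {}): Initializes the sanitization rules.
  • - (URI::HTTP, URI::HTTPS) sanitize_url(url): Sanitizes a URL based on filtering options.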
Class Method Details

+ (Object) included(base)

Adds the strip_fragments and strip_query accessors to the class that includes this module.



# File 'lib/spidr/sanitizers.rb', line 5

def self.included(base)
  base.module_eval do
    # Specifies whether the Agent will strip URI fragments
    attr_accessor :strip_fragments

    # Specifies whether the Agent will strip URI queries
    attr_accessor :strip_query
  end
end
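
The hook above can be tried in isolation. The following is a minimal sketch, assuming a hypothetical module DemoSanitizers and class DemoAgent (neither is part of Spidr); it only illustrates how an included hook can define accessors on the including class.

```ruby
# Hypothetical stand-in for the documented module; not Spidr itself.
module DemoSanitizers
  # Ruby calls this hook whenever the module is included somewhere.
  def self.included(base)
    base.module_eval do
      attr_accessor :strip_fragments
      attr_accessor :strip_query
    end
  end
end

# Hypothetical including class.
class DemoAgent
  include DemoSanitizers
end

agent = DemoAgent.new
agent.strip_query = true
```

Because the accessors are defined through module_eval on the including class, they belong to DemoAgent itself rather than to the module.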

Instance Method Details

- (Sanitizers) initialize(options = {})

Initializes the sanitization rules.

Parameters:

  • (Hash) options (defaults to: {}) — Options that control which URL components are stripped.

Options Hash (options):

  • (Boolean) :strip_fragments — default: true — Whether to strip the fragment component from URLs.
  • (Boolean) :strip_query — default: false — Whether to strip the query component from URLs.

Since:

  • 0.2.2


# File 'lib/spidr/sanitizers.rb', line 29

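def initialize(options={})
  # Fragments are stripped by default
  @strip_fragments = true

  # An explicit :strip_fragments value (including false) overrides the default
  if options.has_key?(:strip_fragments)
    @strip_fragments = options[:strip_fragments]
  end

  # Queries are kept unless :strip_query is truthy
  @strip_query = (options[:strip_query] || false)
end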
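
To see how the defaults resolve, here is a minimal sketch that copies the initialization logic into a hypothetical class named DemoRules (an illustration only, not part of Spidr).

```ruby
# Hypothetical class that mirrors the documented initializer.
class DemoRules
  attr_reader :strip_fragments, :strip_query

  def initialize(options = {})
    # Default to stripping fragments unless the caller says otherwise.
    @strip_fragments = true
    @strip_fragments = options[:strip_fragments] if options.has_key?(:strip_fragments)

    # Default to keeping queries.
    @strip_query = (options[:strip_query] || false)
  end
end

defaults       = DemoRules.new
keep_fragments = DemoRules.new(strip_fragments: false)
drop_queries   = DemoRules.new(strip_query: true)
```

The has_key? check matters for :strip_fragments: a plain `options[:strip_fragments] || true` would silently turn an explicit false back into true.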

- (URI::HTTP, URI::HTTPS) sanitize_url(url)

Sanitizes a URL by removing its fragment and/or query component, according to the strip_fragments and strip_query settings.

Parameters:

  • (URI::HTTP, URI::HTTPS, String) url — The URL to be sanitized.

Returns:

  • (URI::HTTP, URI::HTTPS) — The new sanitized URL.

Since:

  • 0.2.2


# File 'lib/spidr/sanitizers.rb', line 50

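def sanitize_url(url)
  # Accept either a String or a URI; always work on a fresh URI object
  url = URI(url.to_s)

  # Drop each component only when the matching rule is enabled
  url.fragment = nil if @strip_fragments
  url.query = nil if @strip_query

  return url
end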
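
The effect on a concrete URL can be sketched with Ruby's standard uri library alone. The helper demo_sanitize below is hypothetical: it takes the two rules as keyword arguments instead of instance variables, and is not the Spidr method itself.

```ruby
require 'uri'

# Hypothetical standalone version of the documented logic.
def demo_sanitize(url, strip_fragments: true, strip_query: false)
  url = URI(url.to_s)            # works for Strings and URI objects alike
  url.fragment = nil if strip_fragments
  url.query    = nil if strip_query
  url
end

raw = 'http://example.com/page?id=1#top'

default_result = demo_sanitize(raw).to_s
# => "http://example.com/page?id=1"

strict_result = demo_sanitize(URI(raw), strip_query: true).to_s
# => "http://example.com/page"
```

Since the input is round-tripped through to_s, the URI passed in is never modified; the caller always gets a new object back.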