class Mechanize
The Mechanize library is used for automating interactions with a website. It can follow links and submit forms. Form fields can be populated and submitted. A history of URLs is maintained and can be queried.
Example
  require 'mechanize'
  require 'logger'

  agent = Mechanize.new
  agent.log = Logger.new "mech.log"
  agent.user_agent_alias = 'Mac Safari'

  page = agent.get "http://www.google.com/"
  search_form = page.form_with :name => "f"
  search_form.field_with(:name => "q").value = "Hello"

  search_results = agent.submit search_form
  puts search_results.body
Issues with mechanize
If you think you have a bug with mechanize, but aren’t sure, please file a ticket at github.com/sparklemotion/mechanize/issues
Here are some common problems you may experience with mechanize:
Problems connecting to SSL sites
Mechanize defaults to validating SSL certificates using the default CA certificates for your platform. At this time, Windows users do not have integration between the OS default CA certificates and OpenSSL. cert_store explains how to download and use Mozilla’s CA certificates to allow SSL sites to work.
Problems with content-length
Some sites return an incorrect content-length value. Unlike a browser, mechanize raises an error when the content-length header does not match the response length since it does not know if there was a connection problem or if the mismatch is a server bug.
The error raised, Mechanize::ResponseReadError, can be converted to a parsed Page, File, etc. depending upon the content-type:
  agent = Mechanize.new
  uri = URI 'http://example/invalid_content_length'

  begin
    page = agent.get uri
  rescue Mechanize::ResponseReadError => e
    page = e.force_parse
  end
Constants
- AGENT_ALIASES
Supported User-Agent aliases for use with user_agent_alias=. The description in parentheses is for informative purposes and is not part of the alias name.
The default User-Agent alias:
- “Mechanize”
Linux User-Agent aliases:
- “Linux Firefox”
- “Linux Konqueror”
- “Linux Mozilla”
Mac User-Agent aliases:
- “Mac Firefox”
- “Mac Mozilla”
- “Mac Safari 4”
- “Mac Safari”
Windows User-Agent aliases:
- “Windows Chrome”
- “Windows Edge”
- “Windows Firefox”
- “Windows IE 6”
- “Windows IE 7”
- “Windows IE 8”
- “Windows IE 9”
- “Windows IE 10”
- “Windows IE 11”
- “Windows Mozilla”
Mobile User-Agent aliases:
- “Android”
- “iPad”
- “iPhone”
Example:
  agent = Mechanize.new
  agent.user_agent_alias = 'Mac Safari'
Public Class Methods
Source
  # File lib/mechanize.rb, line 209
  def initialize(connection_name = 'mechanize')
    @agent = Mechanize::HTTP::Agent.new(connection_name)
    @agent.context = self
    @log = nil

    # attr_accessors
    @agent.user_agent = AGENT_ALIASES['Mechanize']
    @watch_for_set = nil
    @history_added = nil

    # attr_readers
    @pluggable_parser = PluggableParser.new
    @keep_alive_time = 0

    # Proxy
    @proxy_addr = nil
    @proxy_port = nil
    @proxy_user = nil
    @proxy_pass = nil

    @html_parser = self.class.html_parser

    @default_encoding = nil
    @force_default_encoding = false

    # defaults
    @agent.max_history = 50

    yield self if block_given?

    @agent.set_proxy @proxy_addr, @proxy_port, @proxy_user, @proxy_pass
  end
Creates a new mechanize instance. If a block is given, the created instance is yielded to the block for setting up pre-connection state such as SSL parameters or proxies:
  agent = Mechanize.new do |a|
    a.proxy_addr = 'proxy.example'
    a.proxy_port = 8080
  end
If you need segregated SSL connections give each agent a unique name. Otherwise the connections will be shared. This is particularly important if you are using certificates.
  agent_1 = Mechanize.new 'conn1'
  agent_2 = Mechanize.new 'conn2'
Source
  # File lib/mechanize.rb, line 184
  def self.start
    instance = new
    yield(instance)
  ensure
    instance.shutdown
  end
Creates a new Mechanize instance and yields it to the given block.
After the block executes, the instance is cleaned up. This includes closing all open connections.
  Mechanize.start do |m|
    m.get("http://example.com")
  end
History
Public Instance Methods
Source
  # File lib/mechanize.rb, line 250
  def back
    @agent.history.pop
  end
Equivalent to the browser back button. Returns the previous page visited.
Source
  # File lib/mechanize.rb, line 257
  def current_page
    @agent.current_page
  end
Returns the latest page loaded by Mechanize.
Source
  # File lib/mechanize.rb, line 266
  def history
    @agent.history
  end
The history of this mechanize run.
Source
  # File lib/mechanize.rb, line 275
  def max_history
    @agent.history.max_size
  end
Maximum number of items allowed in the history. The default setting is 50 pages. Note that the size of the history multiplied by the maximum response body size is an approximate upper limit on the memory mechanize uses for stored responses; see max_file_buffer=.
Source
  # File lib/mechanize.rb, line 289
  def max_history= length
    @agent.history.max_size = length
  end
Sets the maximum number of items allowed in the history to length.
Setting the maximum history length to nil will make the history size unlimited. Take care when doing this: mechanize stores response bodies in memory for pages and in the temporary files directory for other responses. For a long-running mechanize program this can be quite large.
See also the discussion under max_file_buffer=.
Source
  # File lib/mechanize.rb, line 296
  def visited? url
    url = url.href if url.respond_to? :href

    @agent.visited_page url
  end
Returns a visited page for the url passed in, otherwise nil.
Hooks
Attributes
Callback which is invoked with the page that was added to history.
Public Instance Methods
Source
  # File lib/mechanize.rb, line 317
  def content_encoding_hooks
    @agent.content_encoding_hooks
  end
A list of hooks to call before reading response header ‘content-encoding’.
The hook is called with the agent making the request, the URI of the request, the response, and an IO containing the response body.
Source
  # File lib/mechanize.rb, line 330
  def post_connect_hooks
    @agent.post_connect_hooks
  end
A list of hooks to call after retrieving a response. Hooks are called with the agent, the URI, the response, and the response body.
Source
  # File lib/mechanize.rb, line 338
  def pre_connect_hooks
    @agent.pre_connect_hooks
  end
A list of hooks to call before retrieving a response. Since no response exists yet at this point, hooks are called with the agent and the request to be performed.
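As a rough illustration of the post-connect hook shape described above, the sketch below builds a hook as a lambda taking the agent, the URI, the response, and the response body. `FakeResponse`, `seen`, and `log_status` are names invented here so the hook can be exercised without a network; with a real agent the lambda would presumably be registered with `agent.post_connect_hooks << log_status`.

```ruby
require 'uri'

# Hypothetical stand-in for an HTTP response; only #code is used here.
FakeResponse = Struct.new(:code)

seen = []

# A post-connect style hook: receives the agent, URI, response and body.
log_status = lambda do |_agent, uri, response, _body|
  seen << [uri.to_s, response.code]
end

# Exercise the hook directly, without making a request.
log_status.call(nil, URI('http://example/'), FakeResponse.new('200'), '')
```

Keeping the hook a plain lambda makes it easy to test in isolation before attaching it to an agent.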
Requests
Public Instance Methods
Source
  # File lib/mechanize.rb, line 351
  def click link
    case link
    when Page::Link then
      referer = link.page || current_page()
      if @agent.robots
        if (referer.is_a?(Page) and referer.parser.nofollow?) or link.rel?('nofollow') then
          raise RobotsDisallowedError.new(link.href)
        end
      end
      if link.noreferrer?
        href = @agent.resolve(link.href, link.page || current_page)
        referer = Page.new
      else
        href = link.href
      end
      get href, [], referer
    when String, Regexp then
      if real_link = page.link_with(:text => link)
        click real_link
      else
        button = nil
        # Note that this will not work if we have since navigated to a different page.
        # Should rather make each button aware of its parent form.
        form = page.forms.find do |f|
          button = f.button_with(:value => link)
          button.is_a? Form::Submit
        end
        submit form, button if form
      end
    when Form::Submit, Form::ImageButton then
      # Note that this will not work if we have since navigated to a different page.
      # Should rather make each button aware of its parent form.
      form = page.forms.find do |f|
        f.buttons.include?(link)
      end
      submit form, link if form
    else
      referer = current_page()
      href = link.respond_to?(:href) ? link.href : (link['href'] || link['src'])
      get href, [], referer
    end
  end
If the parameter is a string, finds the button or link with the value of the string on the current page and clicks it. Otherwise, clicks the Mechanize::Page::Link object passed in. Returns the page fetched.
Source
  # File lib/mechanize.rb, line 445
  def delete(uri, query_params = {}, headers = {})
    page = @agent.fetch(uri, :delete, headers, query_params)
    add_to_history(page)
    page
  end
DELETE uri with query_params, and setting headers:
query_params is formatted into a query string using Mechanize::Util.build_query_string, which see.
  delete('http://example/', {'q' => 'foo'}, {})
Source
  # File lib/mechanize.rb, line 410
  def download uri, io_or_filename, parameters = [], referer = nil, headers = {}
    page = transact do
      get uri, parameters, referer, headers
    end

    io = if io_or_filename.respond_to? :write then
           io_or_filename
         else
           ::File.open(io_or_filename, 'wb')
         end

    case page
    when Mechanize::File then
      io.write page.body
    else
      body_io = page.body_io

      until body_io.eof? do
        io.write body_io.read 16384
      end
    end

    page
  ensure
    io.close if io and not io_or_filename.respond_to? :write
  end
GETs uri and writes it to io_or_filename without recording the request in the history. If io_or_filename does not respond to write it will be used as a file name. parameters, referer and headers are used as in get.
By default, if the Content-type of the response matches a Mechanize::File or Mechanize::Page parser, the response body will be loaded into memory before being saved. See pluggable_parser for details on changing this default.
For alternate ways of downloading files see Mechanize::FileSaver and Mechanize::DirectorySaver.
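The io_or_filename handling described above can be sketched on its own, without mechanize. The helper name `write_body` and the sample paths are assumptions made for this illustration; the sketch only mirrors the documented rule that anything responding to write is used directly and anything else is treated as a file name.

```ruby
require 'stringio'
require 'tmpdir'

# Writes +body+ to +io_or_filename+. Anything responding to #write is
# used as-is; anything else is treated as a file name.
def write_body(body, io_or_filename)
  owns_io = !io_or_filename.respond_to?(:write)
  io = owns_io ? File.open(io_or_filename, 'wb') : io_or_filename
  io.write body
ensure
  # Only close handles that this method opened itself.
  io.close if io && owns_io
end

# Case 1: an IO-like object is written to directly and left open.
buffer = StringIO.new
write_body('hello', buffer)

# Case 2: a String is treated as a path, opened, written, and closed.
path = File.join(Dir.mktmpdir, 'body.bin')
write_body('hello', path)
```

Leaving caller-supplied IO objects open matches the ensure clause in download, which closes only the file it opened.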
Source
  # File lib/mechanize.rb, line 460
  def get(uri, parameters = [], referer = nil, headers = {})
    method = :get

    referer ||=
      if uri.to_s =~ %r{\Ahttps?://}
        Page.new
      else
        current_page || Page.new
      end

    # FIXME: Huge hack so that using a URI as a referer works. I need to
    # refactor everything to pass around URIs but still support
    # Mechanize::Page#base
    unless Mechanize::Parser === referer then
      referer = if referer.is_a?(String) then
                  Page.new URI(referer)
                else
                  Page.new referer
                end
    end

    # fetch the page
    headers ||= {}
    page = @agent.fetch uri, method, headers, parameters, referer
    add_to_history(page)
    yield page if block_given?
    page
  end
GET the uri with the given request parameters, referer and headers.
The referer may be a URI or a page.
parameters is formatted into a query string using Mechanize::Util.build_query_string, which see.
Source
  # File lib/mechanize.rb, line 492
  def get_file(url)
    get(url).body
  end
GET url and return only its contents.
Source
  # File lib/mechanize.rb, line 504
  def head(uri, query_params = {}, headers = {})
    page = @agent.fetch uri, :head, headers, query_params

    yield page if block_given?

    page
  end
HEAD uri with query_params and headers:
query_params is formatted into a query string using Mechanize::Util.build_query_string, which see.
  head('http://example/', {'q' => 'foo'}, {})
Source
  # File lib/mechanize.rb, line 529
  def post(uri, query = {}, headers = {})
    return request_with_entity(:post, uri, query, headers) if String === query

    node = {}
    # Create a fake form
    class << node
      def search(*args); []; end
    end
    node['method'] = 'POST'
    node['enctype'] = 'application/x-www-form-urlencoded'

    form = Form.new(node)

    Mechanize::Util.each_parameter(query) { |k, v|
      if v.is_a?(IO)
        form.enctype = 'multipart/form-data'
        ul = Form::FileUpload.new({'name' => k.to_s},::File.basename(v.path))
        ul.file_data = v.read
        form.file_uploads << ul
      elsif v.is_a?(Form::FileUpload)
        form.enctype = 'multipart/form-data'
        form.file_uploads << v
      else
        form.fields << Form::Field.new({'name' => k.to_s},v)
      end
    }
    post_form(uri, form, headers)
  end
POST to the given uri with the given query.
query is processed using Mechanize::Util.each_parameter (which see), and then encoded into an entity body. If any IO/FileUpload object is specified as a field value the “enctype” will be multipart/form-data, or application/x-www-form-urlencoded otherwise.
Examples:
  agent.post 'http://example.com/', "foo" => "bar"

  agent.post 'http://example.com/', [%w[foo bar]]

  agent.post('http://example.com/', "<message>hello</message>",
             'Content-Type' => 'application/xml')
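The enctype rule stated above can be restated as a small standalone sketch. `enctype_for` is a hypothetical helper, not part of mechanize, and it checks only for IO values; Form::FileUpload values are left out because they require the mechanize gem.

```ruby
# Picks the enctype the way the description states it: multipart when any
# value is an IO, URL-encoded otherwise. FileUpload values are omitted
# from this sketch.
def enctype_for(query)
  if query.any? { |_name, value| value.is_a?(IO) }
    'multipart/form-data'
  else
    'application/x-www-form-urlencoded'
  end
end

# Any IO object will do for the illustration; a pipe avoids touching disk.
reader, writer = IO.pipe

plain  = enctype_for('foo' => 'bar')
upload = enctype_for('foo' => 'bar', 'file' => reader)

reader.close
writer.close
```

Note that a StringIO is not an IO subclass, so under the `v.is_a?(IO)` check in post it would be treated as an ordinary field value.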
Source
  # File lib/mechanize.rb, line 563
  def put(uri, entity, headers = {})
    request_with_entity(:put, uri, entity, headers)
  end
PUT to uri with entity, and setting headers:
  put('http://example/', 'new content', {'Content-Type' => 'text/plain'})
Source
  # File lib/mechanize.rb, line 571
  def request_with_entity(verb, uri, entity, headers = {})
    cur_page = current_page || Page.new

    log.debug("query: #{ entity.inspect }") if log

    headers = {
      'Content-Type' => 'application/octet-stream',
      'Content-Length' => entity.size.to_s,
    }.update headers

    page = @agent.fetch uri, verb, headers, [entity], cur_page
    add_to_history(page)
    page
  end
Makes an HTTP request to uri using HTTP method verb. entity is used as the request body, if allowed.
Source
  # File lib/mechanize.rb, line 598
  def submit(form, button = nil, headers = {})
    form.add_button_to_query(button) if button

    case form.method.upcase
    when 'POST'
      post_form(form.action, form, headers)
    when 'GET'
      get(form.action.gsub(/\?[^\?]*$/, ''),
          form.build_query,
          form.page,
          headers)
    else
      raise ArgumentError, "unsupported method: #{form.method.upcase}"
    end
  end
Submits form with an optional button.
Without a button:
  page = agent.get('http://example.com')
  agent.submit(page.forms.first)
With a button:
  agent.submit(page.forms.first, page.forms.first.buttons.first)
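For GET forms, the source of submit shown above strips any existing query string from the form action before appending the form's own query. The snippet below applies that same regular expression to two sample URLs; the constant name `STRIP_QUERY` and the URLs are made up for the illustration.

```ruby
# The same substitution submit applies to form.action for GET forms:
# drop a trailing "?..." so the form's own query can replace it.
STRIP_QUERY = /\?[^\?]*$/

with_query    = 'http://example.com/search?old=1'.gsub(STRIP_QUERY, '')
without_query = 'http://example.com/search'.gsub(STRIP_QUERY, '')
```

Both results are the bare action URL, so a query already present in the action is discarded rather than merged.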
Source
  # File lib/mechanize.rb, line 618
  def transact
    history_backup = @agent.history.dup
    begin
      yield self
    ensure
      @agent.history = history_backup
    end
  end
Runs given block, then resets the page history as it was before. self is given as a parameter to the block. Returns the value of the block.
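The backup-and-restore behaviour of transact can be seen in miniature with a toy class. `TinyBrowser` and its methods are hypothetical and exist only for this sketch; it follows the same dup, yield, ensure shape as the source shown above, but it is not mechanize code.

```ruby
# Hypothetical miniature of the backup-and-restore pattern.
class TinyBrowser
  attr_reader :history

  def initialize
    @history = []
  end

  # Records a visit and returns the URL.
  def visit(url)
    @history << url
    url
  end

  # Runs the block, then restores the history to its earlier state,
  # even if the block raises. Returns the block's value.
  def transact
    backup = @history.dup
    begin
      yield self
    ensure
      @history = backup
    end
  end
end

browser = TinyBrowser.new
browser.visit 'http://example.com/'
result = browser.transact { |b| b.visit 'http://example.com/tmp' }
```

After the block finishes, the temporary visit is gone from the history while the block's return value is still available to the caller.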
SSL
Public Instance Methods
Source
  # File lib/mechanize.rb, line 1117
  def ca_file
    @agent.ca_file
  end
Path to an OpenSSL server certificate file.
Source
  # File lib/mechanize.rb, line 1124
  def ca_file= ca_file
    @agent.ca_file = ca_file
  end
Sets the certificate file used for SSL connections.
Source
  # File lib/mechanize.rb, line 1131
  def cert
    @agent.certificate
  end
An OpenSSL client certificate or the path to a certificate file.
Source
  # File lib/mechanize.rb, line 1139
  def cert= cert
    @agent.certificate = cert
  end
Sets the OpenSSL client certificate cert to the given path or certificate instance.
Source
  # File lib/mechanize.rb, line 1165
  def cert_store
    @agent.cert_store
  end
An OpenSSL certificate store for verifying server certificates. This defaults to the default certificate store for your system.
If your system does not ship with a default set of certificates you can retrieve a copy of the set from Mozilla here: curl.haxx.se/docs/caextract.html
(Note that this set does not have an HTTPS download option so you may wish to use the firefox-db2pem.sh script to extract the certificates from a local install to avoid man-in-the-middle attacks.)
After downloading or generating a cacert.pem from the above link you can create a certificate store from the pem file like this:
  cert_store = OpenSSL::X509::Store.new
  cert_store.add_file 'cacert.pem'
And have mechanize use it with:
  agent.cert_store = cert_store
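A slightly more defensive variant of the steps above might check that the PEM file exists before loading it. The helper name `build_cert_store` and the fallback to `set_default_paths` are assumptions added for this sketch, not something the documentation prescribes.

```ruby
require 'openssl'

# Builds a certificate store from a PEM bundle when the file is present.
# Falling back to the system default paths is an assumption of this sketch.
def build_cert_store(pem_path = 'cacert.pem')
  store = OpenSSL::X509::Store.new
  if File.exist?(pem_path)
    store.add_file pem_path
  else
    store.set_default_paths
  end
  store
end

store = build_cert_store('/nonexistent/cacert.pem')
```

The result would then be handed to the agent in the same way as above, with `agent.cert_store = store`.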
Source
  # File lib/mechanize.rb, line 1174
  def cert_store= cert_store
    @agent.cert_store = cert_store
  end
Sets the OpenSSL certificate store to store.
See also cert_store.
Source
  # File lib/mechanize.rb, line 1190
  def key
    @agent.private_key
  end
An OpenSSL private key or the path to a private key.
Source
  # File lib/mechanize.rb, line 1198
  def key= key
    @agent.private_key = key
  end
Sets the OpenSSL client key to the given path or key instance. If a path is given, the path must contain an RSA key file.
Source
  # File lib/mechanize.rb, line 1205
  def pass
    @agent.pass
  end
OpenSSL client key password.
Source
  # File lib/mechanize.rb, line 1212
  def pass= pass
    @agent.pass = pass
  end
Sets the client key password to pass.
Source
  # File lib/mechanize.rb, line 1219
  def ssl_version
    @agent.ssl_version
  end
SSL version to use.
Source
  # File lib/mechanize.rb, line 1227
  def ssl_version= ssl_version
    @agent.ssl_version = ssl_version
  end
Sets the SSL version to use to version without client/server negotiation.
Source
  # File lib/mechanize.rb, line 1239
  def verify_callback
    @agent.verify_callback
  end
A callback for additional certificate verification. See OpenSSL::SSL::SSLContext#verify_callback
The callback can be used for debugging or to ignore errors by always returning true. Specifying nil uses the default method that was valid when the SSLContext was created.
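For the debugging use mentioned above, a callback can record why verification failed and then pass OpenSSL's own verdict through unchanged. In this sketch `FakeStoreContext`, `failures`, and `verify_logger` are invented names; the two-argument shape (a pre-verification flag and a store context responding to error_string) follows OpenSSL's verify callback convention.

```ruby
# Hypothetical stand-in for OpenSSL::X509::StoreContext in this demo.
FakeStoreContext = Struct.new(:error_string)

failures = []

# Records why verification failed, then returns OpenSSL's own verdict
# unchanged, so this callback never weakens verification.
verify_logger = lambda do |preverify_ok, store_context|
  failures << store_context.error_string unless preverify_ok
  preverify_ok
end

ok  = verify_logger.call(true,  FakeStoreContext.new('ok'))
bad = verify_logger.call(false, FakeStoreContext.new('certificate has expired'))
```

Returning the incoming flag keeps verification intact; a callback that always returns true would instead accept every certificate, which is rarely what you want outside of debugging.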
Source
  # File lib/mechanize.rb, line 1246
  def verify_callback= verify_callback
    @agent.verify_callback = verify_callback
  end
Sets the OpenSSL certificate verification callback.
Source
  # File lib/mechanize.rb, line 1255
  def verify_mode
    @agent.verify_mode
  end
The OpenSSL server certificate verification method. The default is OpenSSL::SSL::VERIFY_PEER and certificate verification uses the default system certificates. See also cert_store.
Source
  # File lib/mechanize.rb, line 1262
  def verify_mode= verify_mode
    @agent.verify_mode = verify_mode
  end
Sets the OpenSSL server certificate verification method.
Settings
Attributes
Default HTML parser for all mechanize instances
  Mechanize.html_parser = Nokogiri::XML
Default logger for all mechanize instances
  Mechanize.log = Logger.new $stderr
A default encoding name used when parsing HTML. When set it is used after any other encoding. The default is nil.
Overrides the encodings given by the HTTP server and the HTML page with the default_encoding when set to true.
The HTML parser to be used when parsing documents.
HTTP/1.0 keep-alive time. This is no longer supported by mechanize as it now uses net-http-persistent, which only supports HTTP/1.1 persistent connections.
The pluggable parser maps a response Content-Type to a parser class. The registered Content-Type may be either a full content type like ‘image/png’ or a media type ‘text’. See Mechanize::PluggableParser for further details.
Example:
  agent.pluggable_parser['application/octet-stream'] = Mechanize::Download
The HTTP proxy address
The HTTP proxy password
The HTTP proxy port
The HTTP proxy username
The value of watch_for_set is passed to pluggable parsers for retrieved content.
Public Instance Methods
Source
  # File lib/mechanize.rb, line 742
  def add_auth uri, user, password, realm = nil, domain = nil
    @agent.add_auth uri, user, password, realm, domain
  end
Adds credentials user, pass for uri. If realm is set the credentials are used only for that realm. If realm is not set the credentials become the default for any realm on that URI.
domain and realm are exclusive as NTLM does not follow RFC 2617. If domain is given it is only used for NTLM authentication.
Source
  # File lib/mechanize.rb, line 719
  def auth user, password, domain = nil
    c = caller_locations(1,1).first

    warn <<-WARNING
  At #{c.absolute_path} line #{c.lineno}

  Use of #auth and #basic_auth are deprecated due to a security vulnerability.

    WARNING

    @agent.add_default_auth user, password, domain
  end
NOTE: These credentials will be used as a default for any challenge, exposing your password to disclosure to malicious servers. Use of this method will warn. This method is deprecated and will be removed in mechanize 3.
Sets the user and password as the default credentials to be used for HTTP authentication for any server. The domain is used for NTLM authentication.
Source
  # File lib/mechanize.rb, line 749
  def conditional_requests
    @agent.conditional_requests
  end
Are If-Modified-Since conditional requests enabled?
Source
  # File lib/mechanize.rb, line 756
  def conditional_requests= enabled
    @agent.conditional_requests = enabled
  end
Enables or disables If-Modified-Since conditional requests (enabled by default).
Source
  # File lib/mechanize.rb, line 785
  def follow_meta_refresh
    @agent.follow_meta_refresh
  end
Follow HTML meta refresh and HTTP Refresh headers. If set to :anywhere, meta refresh tags outside of the head element will be followed.
Source
  # File lib/mechanize.rb, line 793
  def follow_meta_refresh= follow
    @agent.follow_meta_refresh = follow
  end
Controls following of HTML meta refresh and HTTP Refresh headers in responses.
Source
  # File lib/mechanize.rb, line 803
  def follow_meta_refresh_self
    @agent.follow_meta_refresh_self
  end
Follow an HTML meta refresh and HTTP Refresh headers that have no “url=” in the content attribute.
Defaults to false to prevent infinite refresh loops.
Source
  # File lib/mechanize.rb, line 811
  def follow_meta_refresh_self= follow
    @agent.follow_meta_refresh_self = follow
  end
Alters the following of HTML meta refresh and HTTP Refresh headers that point to the same page.
Source
  # File lib/mechanize.rb, line 818
  def gzip_enabled
    @agent.gzip_enabled
  end
Is gzip compression of responses enabled?
Source
  # File lib/mechanize.rb, line 825
  def gzip_enabled=enabled
    @agent.gzip_enabled = enabled
  end
Enables or disables HTTP/1.1 gzip compression (enabled by default).
Source
  # File lib/mechanize.rb, line 832
  def idle_timeout
    @agent.idle_timeout
  end
Connections that have not been used in this many seconds will be reset.
Source
  # File lib/mechanize.rb, line 840
  def idle_timeout= idle_timeout
    @agent.idle_timeout = idle_timeout
  end
Sets the idle timeout to idle_timeout. The default timeout is 5 seconds. If you experience “too many connection resets”, reducing this value may help.
Source
  # File lib/mechanize.rb, line 854
  def ignore_bad_chunking
    @agent.ignore_bad_chunking
  end
When set to true mechanize will ignore an EOF during chunked transfer encoding so long as at least one byte was received. Be careful when enabling this as it may cause data loss.
Net::HTTP does not inform mechanize of where in the chunked stream the EOF occurred. Usually it is after the last-chunk but before the terminating CRLF (invalid termination) but it may occur earlier. In the second case your response body may be incomplete.
Source
  # File lib/mechanize.rb, line 862
  def ignore_bad_chunking= ignore_bad_chunking
    @agent.ignore_bad_chunking = ignore_bad_chunking
  end
When set to true mechanize will ignore an EOF during chunked transfer encoding. See ignore_bad_chunking for further details.
Source
  # File lib/mechanize.rb, line 869
  def keep_alive
    @agent.keep_alive
  end
Are HTTP/1.1 keep-alive connections enabled?
Source
  # File lib/mechanize.rb, line 880
  def keep_alive= enable
    @agent.keep_alive = enable
  end
Disable HTTP/1.1 keep-alive connections if enable is set to false. If you are experiencing “too many connection resets” errors setting this to false will eliminate them.
You should first investigate reducing idle_timeout.
Source
  # File lib/mechanize.rb, line 887
  def log
    @log || Mechanize.log
  end
The current logger. If no logger has been set Mechanize.log is used.
Source
  # File lib/mechanize.rb, line 894
  def log= logger
    @log = logger
  end
Sets the logger used by this instance of mechanize.
Source
  # File lib/mechanize.rb, line 904
  def max_file_buffer
    @agent.max_file_buffer
  end
Responses larger than this will be written to a Tempfile instead of stored in memory. The default is 100,000 bytes.
A value of nil disables creation of Tempfiles.
Source
  # File lib/mechanize.rb, line 921
  def max_file_buffer= bytes
    @agent.max_file_buffer = bytes
  end
Sets the maximum size of a response body that will be stored in memory to bytes. A value of nil causes all response bodies to be stored in memory.
Note that for Mechanize::Download subclasses, the maximum buffer size multiplied by the number of pages stored in history (controlled by max_history) is an approximate upper limit on the amount of memory Mechanize will use. By default, Mechanize can use up to ~5MB to store response bodies for non-File and non-Page (HTML) responses.
See also the discussion under max_history=
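The ~5MB figure follows from multiplying the two documented defaults, 100,000 bytes per buffered response and 50 history entries. The arithmetic is spelled out below; the variable names are local to this sketch rather than mechanize settings being read.

```ruby
# Documented defaults, restated as plain numbers.
max_file_buffer = 100_000 # bytes kept in memory per response
max_history     = 50      # pages kept in history

# Approximate upper bound on memory used for buffered response bodies.
upper_bound_bytes = max_file_buffer * max_history
upper_bound_mb    = upper_bound_bytes / 1_000_000.0
```

Raising either setting scales the bound proportionally, which is why the two are best tuned together.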
Source
  # File lib/mechanize.rb, line 928
  def open_timeout
    @agent.open_timeout
  end
Length of time to wait until a connection is opened, in seconds.
Source
  # File lib/mechanize.rb, line 935
  def open_timeout= open_timeout
    @agent.open_timeout = open_timeout
  end
Sets the connection open timeout to open_timeout.
Source
  # File lib/mechanize.rb, line 942
  def read_timeout
    @agent.read_timeout
  end
Length of time to wait for data from the server.
Source
  # File lib/mechanize.rb, line 950
  def read_timeout= read_timeout
    @agent.read_timeout = read_timeout
  end
Sets the timeout for each chunk of data read from the server to read_timeout. A single request may read many chunks of data.
Source
  # File lib/mechanize.rb, line 977
  def redirect_ok
    @agent.redirect_ok
  end
Controls how mechanize deals with redirects. The following values are allowed:
- :all, true: All 3xx redirects are followed (default)
- :permanent: Only 301 Moved Permanently redirects are followed
- false: No redirects are followed
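The allowed values above can be restated as a small decision function. `follow_redirect?` is a hypothetical helper written for this sketch; it encodes only the documented list and should not be read as mechanize's actual redirect handling, which may treat individual status codes differently.

```ruby
# Hypothetical helper restating the documented values; not mechanize code.
def follow_redirect?(policy, status)
  # Only 3xx statuses are candidates for following at all.
  return false unless (300..399).cover?(status)

  case policy
  when :all, true then true
  when :permanent then status == 301
  else false
  end
end

all_on_302       = follow_redirect?(:all, 302)
permanent_on_301 = follow_redirect?(:permanent, 301)
permanent_on_302 = follow_redirect?(:permanent, 302)
disabled_on_301  = follow_redirect?(false, 301)
```

With a real agent the policy would be chosen through the setter, for example `agent.redirect_ok = :permanent`.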
Source
  # File lib/mechanize.rb, line 987
  def redirect_ok= follow
    @agent.redirect_ok = follow
  end
Sets the mechanize redirect handling policy. See redirect_ok for allowed values.
Source
  # File lib/mechanize.rb, line 996
  def redirection_limit
    @agent.redirection_limit
  end
Maximum number of redirections to follow.
Source
  # File lib/mechanize.rb, line 1003
  def redirection_limit= limit
    @agent.redirection_limit = limit
  end
Sets the maximum number of redirections to follow to limit.
Source
  # File lib/mechanize.rb, line 1016
  def request_headers
    @agent.request_headers
  end
A hash of custom request headers that will be sent on every request.
Source
  # File lib/mechanize.rb, line 1024
  def request_headers= request_headers
    @agent.request_headers = request_headers
  end
Replaces the custom request headers that will be sent on every request with request_headers.
Source
  # File lib/mechanize.rb, line 1009
  def resolve link
    @agent.resolve link
  end
Resolve the full path of a link / uri.
Source
  # File lib/mechanize.rb, line 1031
  def retry_change_requests
    @agent.retry_change_requests
  end
Retry POST and other non-idempotent requests. See RFC 2616 9.1.2.
Source
  # File lib/mechanize.rb, line 1045
  def retry_change_requests= retry_change_requests
    @agent.retry_change_requests = retry_change_requests
  end
When setting retry_change_requests to true you are stating that, for all the URLs you access with mechanize, making POST and other non-idempotent requests is safe and will not cause data duplication or other harmful results.
If you are experiencing “too many connection resets” errors you should instead investigate reducing the idle_timeout or disabling keep_alive connections.
Source
  # File lib/mechanize.rb, line 1052
  def robots
    @agent.robots
  end
Will /robots.txt files be obeyed?
Source
  # File lib/mechanize.rb, line 1060
  def robots= enabled
    @agent.robots = enabled
  end
When enabled mechanize will retrieve and obey robots.txt files.
Source
  # File lib/mechanize.rb, line 1067
  def scheme_handlers
    @agent.scheme_handlers
  end
The handlers for HTTP and other URI protocols.
Source
  # File lib/mechanize.rb, line 1074
  def scheme_handlers= scheme_handlers
    @agent.scheme_handlers = scheme_handlers
  end
Replaces the URI scheme handler table with scheme_handlers.
Source
  # File lib/mechanize.rb, line 1081
  def user_agent
    @agent.user_agent
  end
The identification string for the client initiating a web request.
Source
  # File lib/mechanize.rb, line 1089
  def user_agent= user_agent
    @agent.user_agent = user_agent
  end
Sets the User-Agent used by mechanize to user_agent. See also user_agent_alias.
Source
  # File lib/mechanize.rb, line 1098
  def user_agent_alias= name
    self.user_agent = AGENT_ALIASES[name] ||
      raise(ArgumentError, "unknown agent alias #{name.inspect}")
  end
Set the user agent for the Mechanize object based on the given name.
See also AGENT_ALIASES
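As the source of user_agent_alias= shows, an unknown alias raises ArgumentError rather than silently falling back. The sketch below reproduces that lookup-or-raise shape with a placeholder table; `DEMO_ALIASES`, `demo_user_agent`, and the agent strings are invented for the illustration and are not the real AGENT_ALIASES values.

```ruby
# Placeholder table; the real AGENT_ALIASES maps names to full UA strings.
DEMO_ALIASES = {
  'Mechanize'  => 'demo-mechanize-agent',
  'Mac Safari' => 'demo-mac-safari-agent',
}.freeze

# Lookup-or-raise, the same shape as user_agent_alias=.
def demo_user_agent(name)
  DEMO_ALIASES[name] ||
    raise(ArgumentError, "unknown agent alias #{name.inspect}")
end

known = demo_user_agent('Mac Safari')

# An alias that is not in the table raises instead of returning nil.
error = begin
  demo_user_agent('Mac Safari 99')
  nil
rescue ArgumentError => e
  e.message
end
```

Because alias names must match exactly, a typo surfaces immediately as an exception instead of sending an unexpected User-Agent.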
Source
  # File lib/mechanize.rb, line 957
  def write_timeout
    @agent.write_timeout
  end
Length of time to wait for data to be sent to the server.
Source
  # File lib/mechanize.rb, line 965
  def write_timeout= write_timeout
    @agent.write_timeout = write_timeout
  end
Sets the timeout for each chunk of data to be sent to the server to write_timeout. A single request may write many chunks of data.
Utilities
Constants
- Cookie
- VERSION
Public Instance Methods
Source
  # File lib/mechanize.rb, line 1274
  def parse uri, response, body
    content_type = nil

    unless response['Content-Type'].nil?
      data, = response['Content-Type'].split ';', 2
      content_type, = data.downcase.split ',', 2 unless data.nil?
    end

    parser_klass = @pluggable_parser.parser content_type

    unless parser_klass <= Mechanize::Download then
      body = case body
             when IO, Tempfile, StringIO then
               body.read
             else
               body
             end
    end

    parser_klass.new uri, response, body, response.code do |parser|
      parser.mech = self if parser.respond_to? :mech=

      parser.watch_for_set = @watch_for_set if @watch_for_set and parser.respond_to?(:watch_for_set=)
    end
  end
Parses the body of the response from uri using the pluggable parser that matches its content type.
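The way parse reduces a Content-Type header to a bare media type can be tried in isolation. `media_type` is a hypothetical helper that copies the two split calls from the source shown above; the sample header values are made up.

```ruby
# Mirrors the header handling in parse: drop parameters after ";",
# downcase, and keep only the first of any comma-separated values.
def media_type(content_type_header)
  return nil if content_type_header.nil?

  data, = content_type_header.split ';', 2
  type, = data.downcase.split ',', 2 unless data.nil?
  type
end

html    = media_type('Text/HTML; charset=UTF-8')
png     = media_type('image/png')
missing = media_type(nil)
```

The resulting string is what gets looked up in the pluggable parser, so charset parameters and letter case do not affect which parser is chosen.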
Source
  # File lib/mechanize.rb, line 1325
  def reset
    @agent.reset
  end
Clears history and cookies.
Source
  # File lib/mechanize.rb, line 1313
  def set_proxy address, port, user = nil, password = nil
    @proxy_addr = address
    @proxy_port = port
    @proxy_user = user
    @proxy_pass = password

    @agent.set_proxy address, port, user, password
  end
Sets the proxy address at port with an optional user and password.
Source
  # File lib/mechanize.rb, line 1333
  def shutdown
    reset
    @agent.shutdown
  end
Shuts down this session by clearing browsing state and closing all persistent connections.
Private Instance Methods
Source
  # File lib/mechanize.rb, line 1365
  def add_to_history(page)
    @agent.history.push(page, @agent.resolve(page.uri))
    @history_added.call(page) if @history_added
  end
Adds page to the history.
Source
  # File lib/mechanize.rb, line 1343
  def post_form(uri, form, headers = {})
    cur_page = form.page || current_page || Page.new

    request_data = form.request_data

    log.debug("query: #{ request_data.inspect }") if log

    headers = {
      'Content-Type' => form.enctype,
      'Content-Length' => request_data.size.to_s,
    }.merge headers

    # fetch the page
    page = @agent.fetch uri, :post, headers, [request_data], cur_page
    add_to_history(page)
    page
  end
Posts form to uri.