Set Up Web Snapshots

This article covers the options for setting up Web Snapshots in your archive.

Checking Your Website Sitemap Compatibility

  1. Web Snapshots uses XML sitemaps to capture your website. If you cannot find a sitemap for your website, check with your web host or IT team

  2. You must allowlist the three IP addresses we use (all AWS) for Web Snapshots:

    • 52.23.29.34

    • 54.235.88.205

    • 54.84.64.101

      • If your IT team prefers to allowlist a user agent instead, they can create a filter that allows user agents containing the text "ASWebsnapshotsUserAgent". The rest of our user agent string is dynamic and will change as we upgrade the browser we use to capture websites

  3. The sitemap and URL list must match your website's domain

  4. If you use Google Analytics, you must filter out our Web Snapshots traffic
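
For IT teams writing the allowlist rule, the two matching options above can be sketched in Python. This is a minimal illustration, not ArchiveSocial's implementation: the function name is_web_snapshots_request is hypothetical, and treating a match on either the IP address or the user agent text as sufficient is an assumption. Your firewall or filter may apply only one of the two.

```python
import ipaddress

# IP addresses and user agent text exactly as listed in this article.
WEB_SNAPSHOTS_IPS = frozenset(
    ipaddress.ip_address(ip)
    for ip in ("52.23.29.34", "54.235.88.205", "54.84.64.101")
)
WEB_SNAPSHOTS_UA_TOKEN = "ASWebsnapshotsUserAgent"


def is_web_snapshots_request(remote_ip: str, user_agent: str) -> bool:
    """Hypothetical helper: True if a request looks like Web Snapshots traffic.

    Assumption: a match on EITHER the source IP or the user agent text counts.
    """
    try:
        ip_matches = ipaddress.ip_address(remote_ip) in WEB_SNAPSHOTS_IPS
    except ValueError:
        # Not a parseable IP address, so it cannot be one of the listed ones.
        ip_matches = False
    # Only the fixed text is checked; the rest of the user agent is dynamic.
    return ip_matches or WEB_SNAPSHOTS_UA_TOKEN in (user_agent or "")
```

Matching on the fixed text only, rather than the full user agent string, keeps the rule valid when the browser version in the user agent changes.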


Adding Your Website Sitemap For Capture

  1. Log into your archive

  2. Navigate to the Configure tab in the top navigation menu.

  3. Click the Web Snapshots tab in the left navigation menu.

  4. If this is your first time setting up Web Snapshots, you will need to choose an

  5. When the page reloads, click the Add Sitemap button.

  6. Fill out the Add Sitemap menu:

    Note:

    If you do not have a sitemap.xml or sitemap_index.xml defined for your website, you may provide the URL for an HTML page that contains links to all the URLs that you want to snapshot. This will allow Web Snapshots to automatically detect new and changed URLs as you keep your sitemap or the sitemap index up to date.

    • Sitemap Name: ArchiveSocial recommends the following naming convention: City of {name} Gov. Site - XML. This lets your agency easily identify which sites are connected and which type of sitemap is being used.

    • Sitemap Format: Choose the format of the sitemap you are entering (we suggest XML)

    • Sitemap URL: The full URL for the sitemap you are adding. For example: http://example.org/path/sitemap.xml

  7. Click the Save Sitemap button.
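
Before saving, it can help to confirm that the sitemap satisfies the domain requirement from the compatibility checks earlier in this article. The sketch below reads the <loc> entries from a standard XML sitemap and lists any URL whose host differs from your website's domain. The function names and the sample sitemap are hypothetical, and comparing hostnames exactly (so www.example.org and example.org count as different) is an assumption about what "match" means, not a documented Web Snapshots rule.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace defined by the sitemaps.org protocol for sitemap.xml files.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

# Hypothetical sample sitemap used only for illustration.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://example.org/</loc></url>
  <url><loc>http://example.org/news</loc></url>
  <url><loc>http://other.example.com/page</loc></url>
</urlset>"""


def sitemap_urls(xml_text: str) -> list[str]:
    """Return every <loc> URL found in a sitemap or sitemap index."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(f"{SITEMAP_NS}loc") if loc.text]


def urls_off_domain(urls: list[str], site_domain: str) -> list[str]:
    """Return the URLs whose hostname is not exactly site_domain.

    Assumption: "matching the domain" means an exact, case-insensitive
    hostname comparison.
    """
    site = site_domain.lower()
    return [u for u in urls if (urlparse(u).hostname or "").lower() != site]


if __name__ == "__main__":
    # Lists the one sample URL hosted on a different domain.
    print(urls_off_domain(sitemap_urls(SAMPLE_SITEMAP), "example.org"))
```

An empty result suggests every URL in the sitemap is on the expected domain under the exact-match assumption above.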


Adding Specific URLs

  1. Navigate to the Configure tab in the top navigation menu.

  2. Select the Web Snapshots tab in the left navigation menu.

  3. Click Add Site URL.

  4. Enter the full URL address for the page in the Site URL field.

  5. Click the Save Site URL button
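
If you are preparing a list of pages to add, a quick check like the one below can catch entries that are not full URLs before you paste them into the Site URL field. This is a sketch under an assumption: "full URL" is taken to mean an address with an http or https scheme and a hostname. The function name is_full_url is hypothetical, and Web Snapshots may accept or reject addresses differently.

```python
from urllib.parse import urlparse


def is_full_url(url: str) -> bool:
    """Hypothetical check: True if url has an http(s) scheme and a hostname."""
    try:
        parts = urlparse(url.strip())
    except ValueError:
        # urlparse rejects some malformed addresses outright.
        return False
    return parts.scheme in ("http", "https") and bool(parts.hostname)


if __name__ == "__main__":
    for candidate in ("http://example.org/path/page", "example.org/path/page"):
        print(candidate, is_full_url(candidate))
```

An address without a scheme, such as example.org/path/page, fails this check because the parser cannot identify a hostname in it.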


Dynamic Option

Dynamic content is web content that changes based on the behavior, preference, and general interest of a site visitor. This content can be found on websites and in email content and is generated when a user accesses a page. The content is often personalized and what is displayed is based on the data a site has for a user and the time of access. The primary use of this content is to deliver a more positive experience for the end-user.

By default, the Dynamic option for Web Snapshots is turned off. Web Snapshots will detect changes to a page on a site using the XML sitemap and record the change in the archive once a day.

However, if Dynamic is enabled for a sitemap or URL, any widgets on the site or page (such as weather updates, a blog on the website, or a calendar of events) will be captured once daily on every page where they appear. For example, if your website has 100 pages and 12 of these pages have a dynamic widget, Web Snapshots will capture those 12 pages once per day as the widgets update.

Note:

Enabling the Dynamic option could cause you to exceed your monthly record limit.
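
To gauge the effect on your record limit, the 12-page example above can be turned into a rough monthly estimate. The sketch assumes that each daily capture of a dynamic page counts as one record and that a month has 30 days. Neither assumption is stated in this article, so treat the result as an approximation and check your plan for how records are actually counted. The function name is hypothetical.

```python
def estimated_monthly_dynamic_captures(dynamic_pages: int, days_in_month: int = 30) -> int:
    """Rough estimate of captures per month with the Dynamic option enabled.

    Assumption: every page with a dynamic widget is captured once per day,
    and each capture counts as one record.
    """
    if dynamic_pages < 0 or days_in_month < 0:
        raise ValueError("page and day counts must not be negative")
    return dynamic_pages * days_in_month


if __name__ == "__main__":
    # 12 dynamic pages captured daily over a 30-day month: 12 * 30 = 360.
    print(estimated_monthly_dynamic_captures(12))
```

Under these assumptions, the 12 dynamic pages in the example would produce about 360 captures in a 30-day month, in addition to any captures triggered by ordinary page changes.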


Turning Off Specific URLs

  1. Navigate to the Configure tab in the top navigation menu.

  2. Select the Web Snapshots tab in the left navigation menu.

  3. For the URL you wish to turn off, click the gear icon in the Actions column on the Site URLs page.

  4. Toggle the Archiving switch to OFF in the Configure URL pop-up.

  5. Click the Save button in the Configure URL pop-up.