c# – Small sitemap checker .aspx file

I wrote a small sitemap-checker.aspx page. The goal of this code is to find pages that respond with a status code other than 200. I've only included the C# part here for review:

    // Namespaces this snippet relies on (presumably imported elsewhere in the .aspx page):
    // System, System.Collections.Generic, System.Linq, System.Net, System.Xml
    void Start_Click(object sender, EventArgs e)
    {
        CheckUrls(UrlTextBox.Text);
    }

    private void CheckUrls(string url)
    {
        try
        {
            var xmlDoc = LoadXml(url);
            var urlNodes = xmlDoc.GetElementsByTagName("url");
            var numberOfNodes = urlNodes.Count;
            var index = 0;
            
            Log("### Total nodes: " + numberOfNodes+ "<br>");
            Log("Here is the list of URLs, with the status code other than 200: <br>");
            foreach (XmlNode urlNode in urlNodes)
            {
                CheckUrlNode(urlNode, xmlDoc);
                index++;

                if(index % 10 == 0)
                {
                    Log("<span style='color:red'>" + index + " nodes of " + numberOfNodes + " are finished, please wait...</span>");
                }
            }
            Log("### Sitemap check finished. Please analyze the reported urls<br><br>");
        }
        catch (Exception e)
        {
            Log("Some problems were encountered while checking the sitemap<br>");
            Log(e.Message + "<br>");
            Log(e.StackTrace + "<br>");
        }
    }

    private XmlDocument LoadXml(string url)
    {
        var m_strFilePath = url;
        string xmlStr;
        using (var wc = new WebClient())
        {
            xmlStr = wc.DownloadString(m_strFilePath);
        }
        var xmlDoc = new XmlDocument();
        xmlDoc.LoadXml(xmlStr);
        return xmlDoc;
    }

    private void CheckUrlNode(XmlNode xmlNode, XmlDocument xmlDoc)
    {
        var urlList = new List<string>() { xmlNode["loc"].InnerText };

        var nsmgr = new XmlNamespaceManager(xmlDoc.NameTable);
        nsmgr.AddNamespace("xhtml", "http://www.w3.org/1999/xhtml");

        var hrefLangUrls = xmlNode
            .SelectNodes("xhtml:link", nsmgr)
            .Cast<XmlNode>()
            .Select(x => x.Attributes["href"].Value);

        urlList.AddRange(hrefLangUrls);

        foreach (var url in urlList)
        {
            string logMessage;
            if (!CheckUrl(url, out logMessage))
            {
                Log(logMessage);
            }
        }
    }

    private bool CheckUrl(string url, out string logMessage)
    {
        logMessage = null;
        try
        {
            var request = HttpWebRequest.Create(url) as HttpWebRequest;
            request.Timeout = 5000; //set the timeout to 5 seconds to keep the user from waiting too long for the page to load
            request.Method = "HEAD"; //Get only the header information -- no need to download any content

            using (var response = request.GetResponse() as HttpWebResponse)
            {
                int statusCode = (int)response.StatusCode;
                if (statusCode == 200) //Good requests
                {
                    return true;
                }
                else
                {
                    logMessage = String.Format("<a href='{0}'>{0}</a>, status code: {1}", url, statusCode);
                    return false;
                }
            }
        }
        catch (WebException ex)
        {
            logMessage = String.Format("<a href='{0}'>{0}</a>, status: {1}", url, ex.Status);
        }
        catch (Exception ex)
        {
            logMessage = String.Format("Could not test url {0}, exeption message: {1}", url, ex.Message);
        }

        return false;
    }
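
One caveat: HttpWebRequest.GetResponse() throws a WebException for 4xx/5xx responses rather than returning, so the non-200 branch above is unlikely to run. A minimal sketch of reading the status code out of the exception instead (the TryGetStatusCode helper name is only illustrative):

    // Sketch only: GetResponse() throws for error statuses, so the status code
    // is read from the response attached to the WebException.
    private bool TryGetStatusCode(string url, out int statusCode)
    {
        statusCode = 0;
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "HEAD";
        request.Timeout = 5000;
        try
        {
            using (var response = (HttpWebResponse)request.GetResponse())
            {
                statusCode = (int)response.StatusCode;
                return true;
            }
        }
        catch (WebException ex)
        {
            var errorResponse = ex.Response as HttpWebResponse;
            if (errorResponse != null)
            {
                using (errorResponse)
                {
                    statusCode = (int)errorResponse.StatusCode; // e.g. 404, 500
                }
            }
            return false;
        }
    }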

Any Free Image sitemap generator?

I think this is what you need: auditmypc.com/free-sitemap-generator.asp

You can choose to include only “.jpg” files or whatever you need. It uses a Java applet in your client to follow every link and spider every document/page, saving only those you choose in your sitemap.

It can produce urls.txt for Yahoo (old now) or sitemap.xml for Google and Bing, and it can also save a nicely formatted .html file for users to browse, if you want such a thing.

Oh, and it's free :p
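
For reference, image sitemaps for Google follow the standard sitemap protocol plus Google's image extension namespace. A minimal C# sketch of writing one such entry (this is just an illustration of the format, not the tool's own output; the example.com URLs are placeholders and System.Xml.Linq is assumed):

    var sitemapNs = XNamespace.Get("http://www.sitemaps.org/schemas/sitemap/0.9");
    var imageNs = XNamespace.Get("http://www.google.com/schemas/sitemap-image/1.1");

    // One <url> entry with an <image:image> child pointing at a .jpg.
    var doc = new XDocument(
        new XElement(sitemapNs + "urlset",
            new XAttribute(XNamespace.Xmlns + "image", imageNs),
            new XElement(sitemapNs + "url",
                new XElement(sitemapNs + "loc", "http://example.com/page.html"),
                new XElement(imageNs + "image",
                    new XElement(imageNs + "loc", "http://example.com/photo.jpg")))));

    doc.Save("sitemap.xml");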

 

How do I add both a base domain and a subdomain to Google Search Console and submit an XML sitemap for each?

The best way is the one you mentioned in point 2.

You will have to add two separate properties in your Search Console:

  1. http://example.com
  2. http://subdomain.example.com

The procedure in point 1 is not possible as of now, because the new Search Console will not allow you to add a sitemap for a subdomain inside the http://example.com property.

Also, the new Search Console automatically adds a Domain property, which consolidates all the properties in your domain, including all subdomains. Inside the Domain property you can add any sitemap that corresponds to your website, irrespective of whether it is for a subdomain or the root domain.

7 – What is the best solution to generate sitemaps for multiple sites using the Domain Access module?

We have been using the Domain Access module for a multi-site setup.
To generate the sitemap.xml files we installed the XML sitemap module and followed its manual.

Currently the site shows at /admin/config/search/xmlsitemap that the sitemaps have been created.

But when visiting domainname/sitemap.xml we get a 404 error. Is there a workaround needed for the sitemaps to be shown?

Any suggestions and help appreciated. Thanks.

mobile application – What are the best tools for creating a Sitemap?

I am working on a mobile app and I have drawn a sitemap for it on paper. However, I want to create a digital version of the sitemap. I have previously used Adobe Illustrator to create diagrams like this; however, I have found that creating such diagrams in Illustrator is very time-consuming. I am looking for a faster way to create sitemaps.

I was wondering if anyone has recommendations for good free online tools for creating sitemaps. Any insights are appreciated.

seo – How to generate a sitemap for an Angular web application

I am developing a website using the MEAN stack, and I want to create a sitemap for all URLs on the site, as well as an image and video sitemap. The website is dynamic, and there will surely be more links, images, and audio in the future, so I would like a dynamic sitemap that does not need to be updated for each and every link, image, and video.
Please share any ideas you have about sitemap generation, because I have zero knowledge of it. Any help will be appreciated.

8 – How do I debug/kint Simple XML Sitemap link array?

I am trying to find out how I can debug/kint a variable/array from the Simple XML Sitemap module.

I worked through the documentation here: https://www.drupal.org/docs/8/modules/simple-xml-sitemap/api-and-extending-the-module#s-api-hooks to find the hook I need.

My goal is to unset any links that contain node/ in order to remove published but un-aliased nodes of the included content types.

The array key ['path'] looks to be the unaliased URL, and the code below removes all links except the home page. I am unsure how I can kint($link) in this function so I can see what other array keys are available and what else I might use for comparison.

    function HOOK_simple_sitemap_links_alter(array &$links, $sitemap_variant) {

      foreach ($links as $key => $link) {
        if (strpos($link['meta']['path'], 'node/') !== FALSE) {
          unset($links[$key]);
        }
      }
    }

Is there a way to kint() these sitemap arrays? Or maybe some documentation that shows the structure of these arrays?

Do you ping Google when you update your sitemap?

When I update my website, I can see in my access logs that Google downloads my sitemap within 48 hours. If I use something like https://submit-sitemap.com/, I can see in my logs immediately that Google has downloaded my sitemap.

Could I be penalized for pinging Google too much?
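
A service like that presumably just issues an HTTP GET against Google's sitemap ping endpoint. A minimal C# sketch of pinging it yourself (example.com stands in for the real sitemap URL):

    // Ping Google after updating the sitemap; the sitemap URL must be percent-encoded.
    using (var wc = new WebClient())
    {
        var sitemapUrl = Uri.EscapeDataString("https://example.com/sitemap.xml");
        wc.DownloadString("https://www.google.com/ping?sitemap=" + sitemapUrl);
    }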

Google Search Console coverage reports "Submitted URL marked ‘noindex’" for a 404 page not in the sitemap without a noindex tag

Error: "Submitted URL marked ‘noindex’" error in the Google Search Console coverage report

Situation

  • URL is not submitted in our sitemap
  • Google is not indexing the URL
  • There is no noindex tag on the page
  • The URL is not disallowed in robots.txt
  • We can’t find the file on the server, and it returns a 404 on the front end
  • Resubmitted the sitemap and the error persists

Has anyone encountered this situation? How can this be fixed?