Prevent Japanese from being output as entity references (numeric character references) in HTML

Page creation date : Thursday, April 29, 2021

Environment

Visual Studio

Visual Studio 2019

ASP.NET Core

3.1 (Razor Pages, MVC)

Japanese set dynamically in the program is output as entity references

In Index.cshtml, let's display Japanese text typed directly into the file alongside Japanese text output through ViewData, as follows.

<!-- omitted -->

<div class="text-center">
    <h1 class="display-4">Welcome</h1>
    <p>Learn about <a href="https://docs.microsoft.com/aspnet/core">building Web apps with ASP.NET Core</a>.</p>
</div>

<p>ここに固定文字列の日本語を表示させます。</p>
<p>@ViewData["Message"]</p>

On the program side, set the Japanese text in ViewData.

Index.cshtml.cs (for Razor pages)

// omitted

public class IndexModel : PageModel
{
  // omitted

  public void OnGet()
  {
    ViewData["Message"] = "ViewData から日本語を表示させます。";
  }
}

HomeController.cs (for MVC)

// omitted

public class HomeController : Controller
{
  // omitted

  public IActionResult Index()
  {
    ViewData["Message"] = "ViewData から日本語を表示させます。";

    // Render the view for this action
    return View();
  }
}

When you run it in the debugger, both strings are displayed correctly in Japanese.

However, if you look at the page source in a Web browser, you can see that the Japanese output through ViewData is written as entity references (numeric character references).

An entity reference (numeric character reference) is an alphanumeric substitute notation used to represent Unicode characters in an environment that can process only alphanumeric characters or display only certain languages. For example, the character "あ" is written as "&#x3042;" as an entity reference (numeric character reference).
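
The conversion described above can be reproduced outside of a Web page. The following is a minimal sketch, assuming that Razor's output encoding behaves like HtmlEncoder.Default from System.Text.Encodings.Web when no options are configured. The class and method names (CharacterReferenceDemo, EncodeWithDefault) are made up for illustration.

```csharp
using System;
using System.Text.Encodings.Web;

// Hypothetical helper class, used only for this illustration.
public static class CharacterReferenceDemo
{
    // Encodes text with the default HTML encoder, which lets only
    // the Basic Latin range pass through unescaped.
    public static string EncodeWithDefault(string text)
    {
        return HtmlEncoder.Default.Encode(text);
    }

    public static void Main()
    {
        // ASCII letters are left untouched.
        Console.WriteLine(EncodeWithDefault("abc"));   // abc

        // Hiragana is outside Basic Latin, so it becomes a numeric character reference.
        Console.WriteLine(EncodeWithDefault("あ"));    // &#x3042;
    }
}
```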

A typical Web browser decodes entity references in the HTML and displays the characters correctly. Users are usually unaware of the entity references.
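
As a rough illustration of that decoding step, the sketch below uses System.Net.WebUtility.HtmlDecode as a stand-in for what a browser does when it parses the HTML. It is not the browser's actual implementation, and the class and method names (DecodeDemo, Decode) are hypothetical.

```csharp
using System;
using System.Net;

// Hypothetical helper class, used only for this illustration.
public static class DecodeDemo
{
    // Turns character references back into the characters they represent.
    public static string Decode(string html)
    {
        return WebUtility.HtmlDecode(html);
    }

    public static void Main()
    {
        // The reference for U+3042 decodes back to the hiragana "あ".
        Console.WriteLine(Decode("&#x3042;"));
    }
}
```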

Because the page is displayed correctly in a Web browser, there is basically no problem leaving it as is. However, the output contains more characters and the HTML is harder to read during development, so you can have Japanese output as is in the following way.

Prevent Japanese from being output as entity references (numeric character references)

Just add the following code to the Startup.cs file:

// Added
using Microsoft.Extensions.WebEncoders;
using System.Text.Encodings.Web;
using System.Text.Unicode;

// omitted

public class Startup
{
  // omitted

  // This method gets called by the runtime. Use this method to add services to the container.
  public void ConfigureServices(IServiceCollection services)
  {
    // omitted

    // Prevent characters from being output as entity references.
    // To exempt only specific ranges from encoding rather than all of them,
    // specify individual UnicodeRanges properties instead of UnicodeRanges.All.
    services.Configure<WebEncoderOptions>(options =>
    {
      options.TextEncoderSettings = new TextEncoderSettings(UnicodeRanges.All);

      // When specifying ranges individually (example)
      //options.TextEncoderSettings = new TextEncoderSettings(UnicodeRanges.Hiragana, UnicodeRanges.Katakana);
    });
  }

  // omitted
}

In the Startup.ConfigureServices method, call the services.Configure<WebEncoderOptions> method and set options.TextEncoderSettings to an instance of TextEncoderSettings.

If you pass UnicodeRanges.All as the argument, characters in every Unicode range are output as is instead of being converted to entity references. Characters with a special meaning in HTML, such as < and &, are still escaped.

If you want to exempt only specific ranges from conversion, specify one or more other values instead of UnicodeRanges.All. In general, though, unless you have a specific reason, you will probably want to specify UnicodeRanges.All.
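
The difference between the two settings can be checked in a small console program. This is a sketch that builds encoders directly with HtmlEncoder.Create rather than going through ASP.NET Core's service configuration. The class and method names are hypothetical, and adding UnicodeRanges.BasicLatin to the individual ranges is an assumption made here so that ASCII text is also left unescaped.

```csharp
using System;
using System.Text.Encodings.Web;
using System.Text.Unicode;

// Hypothetical helper class, used only for this illustration.
public static class EncoderRangeDemo
{
    // Encoder that allows every Unicode range.
    public static string EncodeAllowingAll(string text)
    {
        var settings = new TextEncoderSettings(UnicodeRanges.All);
        return HtmlEncoder.Create(settings).Encode(text);
    }

    // Encoder that allows only Basic Latin, Hiragana, and Katakana
    // (BasicLatin is an assumption added for this sketch).
    public static string EncodeAllowingKana(string text)
    {
        var settings = new TextEncoderSettings(
            UnicodeRanges.BasicLatin,
            UnicodeRanges.Hiragana,
            UnicodeRanges.Katakana);
        return HtmlEncoder.Create(settings).Encode(text);
    }

    public static void Main()
    {
        // Japanese passes through, but HTML-special characters are still escaped.
        Console.WriteLine(EncodeAllowingAll("日本語 <b>"));  // 日本語 &lt;b&gt;

        // Kanji are outside the allowed ranges here, so they are still converted.
        Console.WriteLine(EncodeAllowingKana("あ日"));       // あ&#x65E5;
    }
}
```

Note that a string mixing kana and kanji, like the ViewData message in this article, is only partly left as is when just the Hiragana and Katakana ranges are allowed.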

If you run it and check the page source, you can see that the Japanese is output directly rather than as entity references.
