Memory mapping¶

Lab objectives¶

  • Understand address space mapping mechanisms
  • Learn about the most important structures related to memory management

Keywords:

  • address space
  • mmap()
  • struct page
  • struct vm_area_struct
  • struct vm_struct
  • remap_pfn_range
  • SetPageReserved()
  • ClearPageReserved()

Overview¶

In the Linux kernel it is possible to map a kernel address space to a user address space. This eliminates the overhead of copying user space information into the kernel space and vice versa. This can be done through a device driver and the user space device interface ( /dev ).

This feature can be used by implementing the mmap() operation in the device driver's struct file_operations and using the mmap() system call in user space.

The basic unit for virtual memory management is a page, whose size is usually 4K, but it can be up to 64K on some platforms. Whenever we work with virtual memory we work with two types of addresses: virtual address and physical address. All CPU accesses (including from kernel space) use virtual addresses that are translated by the MMU into physical addresses with the help of page tables.

A physical page of memory is identified by the Page Frame Number (PFN). The PFN can be easily computed from the physical address by dividing it by the size of the page (or by shifting the physical address right by PAGE_SHIFT bits).

../_images/paging.png

For efficiency reasons, the virtual address space is divided into user space and kernel space. For the same reason, the kernel space contains a memory mapped zone, called lowmem, which is contiguously mapped in physical memory, starting from the lowest possible physical address (usually 0). The virtual address where lowmem is mapped is defined by PAGE_OFFSET .

On a 32-bit system, not all available memory can be mapped in lowmem and because of that there is a separate zone in kernel space called highmem which can be used to arbitrarily map physical memory.

Memory allocated by kmalloc() resides in lowmem and it is physically contiguous. Memory allocated by vmalloc() is not contiguous and does not reside in lowmem (it has a dedicated zone in highmem).

../_images/kernel-virtmem-map.png

Structures used for memory mapping¶

Before discussing the mechanism of memory-mapping a device, we will present some of the basic structures used by the memory management subsystem of the Linux kernel: struct page , struct vm_area_struct , struct mm_struct .

struct page

struct page is used to embed information about all physical pages in the system. The kernel has a struct page structure for all pages in the system.

There are many functions that interact with this structure:

  • virt_to_page() returns the page associated with a virtual address
  • pfn_to_page() returns the page associated with a page frame number
  • page_to_pfn() returns the page frame number associated with a struct page
  • page_address() returns the virtual address of a struct page ; this function can be called only for pages from lowmem
  • kmap() creates a mapping in the kernel for an arbitrary physical page (can be from highmem) and returns a virtual address that can be used to directly reference the page

struct vm_area_struct

struct vm_area_struct holds information about a contiguous virtual memory area. The memory areas of a process can be viewed by inspecting the maps attribute of the process via procfs:

    root@qemux86:~# cat /proc/1/maps
    #address          perms offset   device inode  pathname
    08048000-08050000 r-xp  00000000 fe:00  761    /sbin/init.sysvinit
    08050000-08051000 r--p  00007000 fe:00  761    /sbin/init.sysvinit
    08051000-08052000 rw-p  00008000 fe:00  761    /sbin/init.sysvinit
    092e1000-09302000 rw-p  00000000 00:00  0      [heap]
    4480c000-4482e000 r-xp  00000000 fe:00  576    /lib/ld-2.25.so
    4482e000-4482f000 r--p  00021000 fe:00  576    /lib/ld-2.25.so
    4482f000-44830000 rw-p  00022000 fe:00  576    /lib/ld-2.25.so
    44832000-449a9000 r-xp  00000000 fe:00  581    /lib/libc-2.25.so
    449a9000-449ab000 r--p  00176000 fe:00  581    /lib/libc-2.25.so
    449ab000-449ac000 rw-p  00178000 fe:00  581    /lib/libc-2.25.so
    449ac000-449af000 rw-p  00000000 00:00  0
    b7761000-b7763000 rw-p  00000000 00:00  0
    b7763000-b7766000 r--p  00000000 00:00  0      [vvar]
    b7766000-b7767000 r-xp  00000000 00:00  0      [vdso]
    bfa15000-bfa36000 rw-p  00000000 00:00  0      [stack]

A memory area is characterized by a start address, a stop address, a length, and permissions.

A struct vm_area_struct is created at each mmap() call issued from user space. A driver that supports the mmap() operation must complete and initialize the associated struct vm_area_struct . The most important fields of this structure are:

  • vm_start , vm_end - the beginning and the end of the memory area, respectively (these fields also appear in /proc/<pid>/maps );
  • vm_file - the pointer to the associated file structure (if any);
  • vm_pgoff - the offset of the area within the file;
  • vm_flags - a set of flags;
  • vm_ops - a set of working functions for this area;
  • vm_next , vm_prev - the areas of the same process are chained by a list structure

struct mm_struct

struct mm_struct encompasses all memory areas associated with a process. The mm field of struct task_struct is a pointer to the struct mm_struct of the current process.

Device driver memory mapping¶

Memory mapping is one of the most interesting features of a Unix system. From a driver's point of view, the memory-mapping facility allows direct memory access to a user space device.

To assign a mmap() operation to a driver, the mmap field of the device driver's struct file_operations must be implemented. If that is the case, the user space process can then use the mmap() system call on a file descriptor associated with the device.

The mmap system call takes the following parameters:

    void *mmap(caddr_t addr, size_t len, int prot, int flags, int fd, off_t offset);

To map memory between a device and user space, the user process must open the device and issue the mmap() system call with the resulting file descriptor.

The device driver mmap() operation has the following signature:

    int (*mmap)(struct file *filp, struct vm_area_struct *vma);

The filp field is a pointer to a struct file created when the device is opened from user space. The vma field is used to indicate the virtual address space where the memory should be mapped by the device. A driver should allocate memory (using kmalloc() , vmalloc() , alloc_pages() ) and then map it to the user address space as indicated by the vma parameter using helper functions such as remap_pfn_range() .

remap_pfn_range() will map a contiguous physical address space into the virtual space represented by vm_area_struct :

    int remap_pfn_range(struct vm_area_struct *vma, unsigned long addr,
                        unsigned long pfn, unsigned long size, pgprot_t prot);

remap_pfn_range() expects the following parameters:

  • vma - the virtual memory space in which the mapping is made;
  • addr - the virtual address space from where remapping begins; page tables for the virtual address space between addr and addr + size will be built as needed
  • pfn - the page frame number to which the virtual address should be mapped
  • size - the size (in bytes) of the memory to be mapped
  • prot - protection flags for this mapping

Here is an example of using this function that contiguously maps the physical memory starting at page frame number pfn (memory that was previously allocated) to the vma->vm_start virtual address:

    struct vm_area_struct *vma;
    unsigned long len = vma->vm_end - vma->vm_start;
    int ret;

    ret = remap_pfn_range(vma, vma->vm_start, pfn, len, vma->vm_page_prot);
    if (ret < 0) {
        pr_err("could not map the address area\n");
        return -EIO;
    }

To obtain the page frame number of the physical memory we must consider how the memory allocation was performed. For each of kmalloc() , vmalloc() , alloc_pages() , we must use a different approach. For kmalloc() we can use something like:

    static char *kmalloc_area;

    unsigned long pfn = virt_to_phys((void *)kmalloc_area) >> PAGE_SHIFT;

while for vmalloc() :

    static char *vmalloc_area;

    unsigned long pfn = vmalloc_to_pfn(vmalloc_area);

and finally for alloc_pages() :

    struct page *page;

    unsigned long pfn = page_to_pfn(page);

Attention

Note that memory allocated with vmalloc() is not physically contiguous, so if we want to map a range allocated with vmalloc() , we have to map each page individually and compute the physical address for each page.

Since the pages are mapped to user space, they might be swapped out. To avoid this we must set the PG_reserved bit on the page. Setting it is done using SetPageReserved() while resetting it (which must be done before freeing the memory) is done with ClearPageReserved() :

    void *alloc_mmap_pages(int npages)
    {
        int i;
        char *mem = kmalloc(PAGE_SIZE * npages, GFP_KERNEL);

        if (!mem)
            return mem;

        for (i = 0; i < npages * PAGE_SIZE; i += PAGE_SIZE)
            SetPageReserved(virt_to_page(((unsigned long)mem) + i));

        return mem;
    }

    void free_mmap_pages(void *mem, int npages)
    {
        int i;

        for (i = 0; i < npages * PAGE_SIZE; i += PAGE_SIZE)
            ClearPageReserved(virt_to_page(((unsigned long)mem) + i));

        kfree(mem);
    }

Exercises¶

Important

To solve the exercises, you need to perform these steps:

  • prepare skeletons from templates
  • build modules
  • copy modules to the VM
  • start the VM and test the module in the VM.

The current lab name is memory_mapping. See the exercises for the task name.

The skeleton code is generated from full source examples located in tools/labs/templates . To solve the tasks, start by generating the skeleton code for a complete lab:

    tools/labs $ make clean
    tools/labs $ LABS=<lab name> make skels

You can also generate the skeleton for a single task, using

    tools/labs $ LABS=<lab name>/<task name> make skels

Once the skeleton drivers are generated, build the source:

Then, copy the modules and start the VM:

    tools/labs $ make copy
    tools/labs $ make boot

The modules are placed in /home/root/skels/memory_mapping/<task_name>.

Alternatively, we can copy the files via scp, in order to avoid restarting the VM. For additional details about connecting to the VM via the network, please check Connecting to the Virtual Machine.

Review the Exercises section for more detailed information.

Warning

Before starting the exercises or generating the skeletons, please run git pull inside the Linux repo, to make sure you have the latest version of the exercises.

If you have local changes, the pull command will fail. Check for local changes using git status . If you want to keep them, run git stash before pull and git stash pop after. To discard the changes, run git reset --hard master .

If you already generated the skeleton before git pull you will need to generate it again.

1. Mapping contiguous physical memory to userspace¶

Implement a device driver that maps contiguous physical memory (e.g. obtained via kmalloc() ) to userspace.

Review the Device driver memory mapping section, generate the skeleton for the task named kmmap and fill in the areas marked with TODO 1.

Start by allocating a memory area of NPAGES+2 pages using kmalloc() in the module init function and find the first address in the area that is aligned to a page boundary.

Hint

The size of a page is PAGE_SIZE.

Store the allocated area in kmalloc_ptr and the page-aligned address in kmalloc_area:

Use PAGE_ALIGN() to determine kmalloc_area.

Enable the PG_reserved bit of each page with SetPageReserved() . Clear the bit with ClearPageReserved() before freeing the memory.

Hint

Use virt_to_page() to translate virtual pages into physical pages, as required by SetPageReserved() and ClearPageReserved() .

For verification purposes (using the test below), fill in the first 4 bytes of each page with the following values: 0xaa, 0xbb, 0xcc, 0xdd.

Implement the mmap() driver function.

Hint

For mapping, use remap_pfn_range() . The third argument for remap_pfn_range() is a page frame number (PFN).

To convert from a virtual kernel address to a physical address, use virt_to_phys() .

To convert a physical address to its PFN, shift the address right by PAGE_SHIFT bits.

For testing, load the kernel module and run:

    root@qemux86:~# skels/memory_mapping/test/mmap-test 1

If everything goes well, the test will show "matched" messages.

2. Mapping non-contiguous physical memory to userspace¶

Implement a device driver that maps non-contiguous physical memory (e.g. obtained via vmalloc() ) to userspace.

Review the Device driver memory mapping section, generate the skeleton for the task named vmmap and fill in the areas marked with TODO 1.

Allocate a memory area of NPAGES pages with vmalloc() .

Hint

The size of a page is PAGE_SIZE. Store the allocated area in vmalloc_area. Memory allocated by vmalloc() is page aligned.

Enable the PG_reserved bit of each page with SetPageReserved() . Clear the bit with ClearPageReserved() before freeing the memory.

Hint

Use vmalloc_to_page() to translate virtual pages into the physical pages used by the functions SetPageReserved() and ClearPageReserved() .

For verification purposes (using the test below), fill in the first 4 bytes of each page with the following values: 0xaa, 0xbb, 0xcc, 0xdd.

Implement the mmap driver function.

Hint

To convert from a virtual vmalloc address to a physical address, use vmalloc_to_pfn() which returns a PFN directly.

Attention

vmalloc pages are not physically contiguous so it is necessary to use remap_pfn_range() for each page.

Loop through all virtual pages and for each:

  • determine the physical address
  • map it with remap_pfn_range()

Make sure that you determine the physical address each time and that you use a range of one page for mapping.
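A minimal kernel-side sketch of that loop, assuming the vma passed to the driver's mmap() operation and the vmalloc_area pointer from the skeleton (error handling reduced to the essentials):

```c
/* Map each vmalloc page individually. */
unsigned long start = vma->vm_start;
unsigned long len = vma->vm_end - vma->vm_start;
char *vmalloc_addr = vmalloc_area;
int ret;

while (len > 0) {
        /* The physical page must be looked up on every iteration:
         * consecutive vmalloc pages are not physically contiguous. */
        unsigned long pfn = vmalloc_to_pfn(vmalloc_addr);

        ret = remap_pfn_range(vma, start, pfn, PAGE_SIZE, vma->vm_page_prot);
        if (ret < 0)
                return ret;

        start += PAGE_SIZE;
        vmalloc_addr += PAGE_SIZE;
        len -= PAGE_SIZE;
}
```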

For testing, load the kernel module and run:

                    root@qemux86:~# skels/memory_mapping/test/mmap-test                    1                  

If everything goes well, the test will show "matched" messages.

3. Read / write operations in mapped memory¶

Modify one of the previous modules to allow read / write operations on your device. This is a didactic exercise to see that the same space can also be used with the mmap() call and with read() and write() calls.

Fill in areas marked with TODO 2.

Note

The offset parameter sent to the read / write operation can be ignored as all reads / writes from the test program will be done with 0 offsets.

For testing, load the kernel module and run:

    root@qemux86:~# skels/memory_mapping/test/mmap-test 2

4. Display memory mapped in procfs¶

Using one of the previous modules, create a procfs file in which you display the total memory mapped by the calling process.

Fill in the areas marked with TODO 3.

Create a new entry in procfs ( PROC_ENTRY_NAME , defined in mmap-test.h ) that will show the total memory mapped by the process that called the read() on that file.

Hint

Use proc_create() . For the mode parameter, use 0, and for the parent parameter use NULL. Use my_proc_file_ops for operations.

In the module exit function, delete the PROC_ENTRY_NAME entry using remove_proc_entry() .

Note

A (complex) use and description of the struct seq_file interface can be found in this example.

For this exercise, only a simple use of the interface described here is sufficient. Check the "extra-simple" API described there.

In the my_seq_show() function you will need to:

  • Obtain the struct mm_struct structure of the current process using the get_task_mm() function.

    Hint

    The current process is available via the current variable of type struct task_struct* .

  • Iterate through the entire struct vm_area_struct list associated with the process.

    Hint

    Start from mm->mmap and use the vm_next field of the struct vm_area_struct to navigate through the list of memory areas. Stop when you reach NULL .

  • Use vm_start and vm_end for each area to compute the total size.

  • Use pr_info("%lx %lx\n", ...) to print vm_start and vm_end for each area.

  • To release the struct mm_struct , decrement its reference counter using mmput() .

  • Use seq_printf() to write to the file. Show only the total count, no other messages. Do not even print a newline (\n).

In my_seq_open() register the display function ( my_seq_show() ) using single_open() .

Note

single_open() can use NULL as its third argument.

For testing, load the kernel module and run:

    root@qemux86:~# skels/memory_mapping/test/mmap-test 3

Note

The test waits for a while (it has an internal sleep instruction). While the test waits, use the pmap command in another console to see the mappings of the test and compare them to the test results.