overview
Eric W. Biederman  
Newsgroups: linux.kernel
From: "Eric W. Biederman" <ebied...@xmission.com>
Date: Wed, 19 Jan 2005 10:10:11 +0100
Local: Wed, Jan 19 2005 4:10 am
Subject: [PATCH 0/29] overview

Andrew, the following patchset is against 2.6.11-rc1-mm1 with
all of the kexec patches removed.  The list of removed patches
is included below.

This patchset is a major refresh of the kexec on panic
functionality in the kernel.  Its primary aim was to take
the requirements capture of the kernel crashdump patches and
start integrating the functionality cleanly into the kexec
patches.

Major accomplishments:
- Compat syscall support has been added.
- The crashdump capture code has been separated from the kexec on panic code.
- The kernel to jump to on panic is now loaded in place.
- A long-standing bug that allowed two source pages to copy data
  to a single destination page has been caught and fixed.
- Support for loading an x86_64 kernel in a reserved region of memory has been completed.

The crashdump code is currently slightly broken.  I have attempted to
minimize the breakage so things can quickly be made to work again.

With respect to a final design discussion there are two remaining
open issues.  The first is how little hardware shutdown we can get away
with in the kernel that is panicking.  I believe we can reduce this
to a simple NMI to the other cpus telling them to stop.  This has
been raised as a major concern in previous conversations.

The second issue is the most significant with respect to the
design of a kernel based crash dump capture implementation.  How does
the crashdump capture process discover relevant information about the
kernel that just crashed?  There are two options.

1) As represented by the current crashdump patches, the crashdump
   kernel and the kernel in which it loads are kept in sync so that
   it has up-to-date versions of all of the crashed kernel's data
   structures, because it is built from the same source.  So it only
   needs to find the address of the data structures it would like to
   look at.

2) The relevant information, if it is available when sys_kexec_load
   is called, is exported to user space, or the machine_crash_shutdown
   method marshals what little information must be captured when the
   machine dies in a well known standard format (most likely ELF
   notes).  This allows the crashdump capture process to simply pass
   on the information or utilize it as appropriate.

   If the second method can successfully represent all of the
   interesting information then we can allow kernel version
   skew, between the two kernels, and potentially implement
   the entire crash dump capture process in user space.

As best as I have been able to discover, the interesting information
includes the cpu state (registers) at the time of the crash/panic,
the list of memory regions the kernel that has crashed was using,
and potentially the list of pages dedicated to kernel data as opposed
to user space, so that people with insane amounts of memory (1TB+)
don't require unmanageably large core files.

Andrew, earlier, when asked about the possibility of merging the kexec
on panic code, you said:

> I don't want us to be in a position of merging all that code and then
> finding out that it cannot be made to work "sufficiently well",
> forcing us to revert it and find a new crashdump solution.  You guys
> know far better than I when we will reach that threshold.  If the
> kexec/dump developers can say "yup, this is going to work (because X)"
> then I'm happy.

So here is my subjective view.
- This code needs to sit in a development tree for a little while
  to shake out whatever bugs still linger from my massive refactoring.
- Through the kexec patches the code and design appears to be sound.
  Given that machine_kexec is little more than a jump there are few
  possible implementations that will be able to use it.  The only
  exception I can see is running special dump drivers from the kernel
  that crashed, and I believe no one thinks that will work well.
- Once we finish sorting out the best way to get information out of
  the kernel that crashed I think we will have a complete architecture
  that is largely portable to any architecture.

In the interests of full disclosure, my main interest is using the
kernel as a bootloader for other kernels, and that has been working
fairly well for years now :)

Eric

# Patches to remove from 2.6.11-rc1-mm1 before applying this patchset:
#
assign_irq_vector-section-fix.patch
kexec-i8259-shutdowni386.patch
kexec-i8259-shutdown-x86_64.patch
kexec-apic-virtwire-on-shutdowni386patch.patch
kexec-apic-virtwire-on-shutdownx86_64.patch
kexec-ioapic-virtwire-on-shutdowni386.patch
kexec-apic-virt-wire-fix.patch
kexec-ioapic-virtwire-on-shutdownx86_64.patch
kexec-e820-64bit.patch
kexec-kexec-generic.patch
kexec-ide-spindown-fix.patch
kexec-ifdef-cleanup.patch
kexec-machine_shutdownx86_64.patch
kexec-kexecx86_64.patch
kexec-kexecx86_64-4level-fix.patch
kexec-kexecx86_64-4level-fix-unfix.patch
kexec-machine_shutdowni386.patch
kexec-kexeci386.patch
kexec-use_mm.patch
kexec-loading-kernel-from-non-default-offset.patch
kexec-loading-kernel-from-non-default-offset-fix.patch
kexec-enabling-co-existence-of-normal-kexec-kernel-and-panic-kernel.patch
kexec-ppc-support.patch
#kexec-kexecppc.patch
#
crashdump-documentation.patch
crashdump-memory-preserving-reboot-using-kexec.patch
crashdump-memory-preserving-reboot-using-kexec-fix.patch
kdump-config_discontigmem-fix.patch
crashdump-routines-for-copying-dump-pages.patch
crashdump-routines-for-copying-dump-pages-kmap-fiddle.patch
crashdump-kmap-build-fix.patch
crashdump-register-snapshotting-before-kexec-boot.patch
crashdump-elf-format-dump-file-access.patch
crashdump-linear-raw-format-dump-file-access.patch
crashdump-minor-bug-fixes-to-kexec-crashdump-code.patch
crashdump-cleanups-to-the-kexec-based-crashdump-code.patch
#
x86-rename-apic_mode_exint.patch
x86-local-apic-fix.patch
#
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Discussion subject changed to "Reserving backup region for kexec based crashdumps." by Vivek Goyal
Vivek Goyal  
Newsgroups: linux.kernel
From: Vivek Goyal <vgo...@in.ibm.com>
Date: Fri, 21 Jan 2005 08:10:12 +0100
Local: Fri, Jan 21 2005 2:10 am
Subject: [PATCH] Reserving backup region for kexec based crashdumps.

Hi Andrew,

The following patch is against 2.6.11-rc1-mm2.

As mentioned in the following note from Eric, the crashdump code is
currently broken.

> The crashdump code is currently slightly broken.  I have attempted to
> minimize the breakage so things can quickly be made to work again.

We have started making changes to get crashdump up and running again.
The following items have been identified so far.

1. Reserve the backup region (640k) during kernel bootup.
2. Copy the data to the backup region during a crash (moved to kexec
user space code, patch posted in a separate mail).
3. Prepare ELF headers while loading the kexec panic kernel and store
them in the reserved memory area.
4. Pass the required information to the crashdump kernel, which parses
it and exports it through /proc/vmcore (may be a user space utility,
open to discussion).

The following patch implements item 1 in the list. We shall soon be
rolling out patches for the rest.

Thanks
Vivek

  crashdump-x86-reserve-640k-memory.patch
Discussion subject changed to "[PATCH] Reserving backup region for kexec based crashdumps." by Eric W. Biederman
Eric W. Biederman  
Newsgroups: linux.kernel
From: ebied...@xmission.com (Eric W. Biederman)
Date: Fri, 21 Jan 2005 09:10:10 +0100
Local: Fri, Jan 21 2005 3:10 am
Subject: Re: [Fastboot] [PATCH] Reserving backup region for kexec based crashdumps.

Vivek Goyal <vgo...@in.ibm.com> writes:
> Hi Andrew,

> The following patch is against 2.6.11-rc1-mm2.

> As mentioned in the following note from Eric, the crashdump code is
> currently broken.

> > The crashdump code is currently slightly broken.  I have attempted to
> > minimize the breakage so things can quickly be made to work again.

> We have started making changes to get crashdump up and running again.
> The following items have been identified so far.

> 1. Reserve the backup region (640k) during kernel bootup.

Why do we need a separate region for this?

It should be simple enough to take 640k out of the area kexec reserves
for the crash dump kernel.  That is what the previous code implemented.

> 2. Copy the data to the backup region during a crash (moved to kexec
> user space code, patch posted in a separate mail).

Thanks.  By and large it looks sane.  It won't work yet, but it is
moving in the right direction.

> +++ linux-2.6.11-rc1-mm2-kexec-eric-root/include/linux/kexec.h 2005-01-20 13:55:33.000000000 +0530

> @@ -79,7 +79,7 @@ struct kimage {
>    unsigned long control_page;

>    /* Flags to indicate special processing */
> -  int type : 1;
> +  unsigned int type : 1;

That looks like a sane bug fix.  Having values of 0 and -1 is not quite
what I was expecting...

Eric

Discussion subject changed to "[PATCH] Reserving backup region for kexec based crashdumps." by Vivek Goyal
Vivek Goyal  
Newsgroups: linux.kernel
From: Vivek Goyal <vgo...@in.ibm.com>
Date: Fri, 21 Jan 2005 11:20:25 +0100
Local: Fri, Jan 21 2005 5:20 am
Subject: Re: [Fastboot] [PATCH] Reserving backup region for kexec based crashdumps.

The previous code also reserved the backup memory region after the crash
kernel region. It is just a matter of interpretation. What I understand
is that the crash kernel reserved region represents an area where one
can load the panic kernel directly, and which the new kernel can use for
memory allocation.

I don't want to steal the backup region from the crash kernel region;
otherwise I shall have to boot the crash kernel with some strange
values like memmap=(32M-640k)@16M (symbolically) to prevent the crash
kernel from overwriting the backup region. Why make the user aware of
the location of the backup region?

Alternatively, this can be managed by reserving this backup region again
in the crash kernel to avoid any stomping. Maybe the backup region
location could be passed to the new kernel through the parameter segment
or through the command line, but I don't see a strong reason for doing that.

Thanks
Vivek

Discussion subject changed to "[PATCH] Reserving backup region for kexec based crashdumps." by Eric W. Biederman
Eric W. Biederman  
Newsgroups: linux.kernel
From: ebied...@xmission.com (Eric W. Biederman)
Date: Fri, 21 Jan 2005 12:20:09 +0100
Local: Fri, Jan 21 2005 6:20 am
Subject: Re: [Fastboot] [PATCH] Reserving backup region for kexec based crashdumps.

On deeper review your patch as it stands is incomplete.  In particular
you don't provide a way to either hardcode or dynamically set
the area you are attempting to reserve to hold the backup region.

Vivek Goyal <vgo...@in.ibm.com> writes:
> On Fri, 2005-01-21 at 13:24, Eric W. Biederman wrote:
> > Why do we need a separate region for this?

> > It should be simple enough to take 640 out of the area kexec reserves
> > for the crash dump kernel.  That is what the previous code implemented.

> Previous code also reserved the backup memory region after crash kernel
> region. It is just a matter of interpretation. What I understand that
> crash kernel reserved region represents something where one can load the
> panic kernel directly and new kernel can use this memory region for
> memory allocation.

Yes the reservation is a hunk of memory reserved for use by the crashdump
process, or whatever happens after panic.  It is up to the loaded code
to define how that memory is used.  purgatory.ro is a legitimate part
of that loaded code.

> I don't want to steal the backup region from crash kernel region
> otherwise, I shall have to boot the crash kernel with some strange
> values like memmap=(32M-640k)@16M (symbolically) to prevent crash kernel
> overwriting backup region. Why to make user aware of location of backup
> region.

Making the user aware of the region gives the user one more thing to
manage manually.  Based on what was passed as crashkernel=..., we
should be able to automate all of the rest of it, so a weird memmap=
line should not be hard.

I will have to wait and see but it would not surprise me if we settled
on a fixed address per architecture for the reservation to make it
easier for various users.

On that note, we probably want to move the magic that we are doing
for crashdumps into the Linux loader (i.e. x86-linux-setup.c) in
kexec-tools, as most of these pieces are specific to taking a
crashdump with Linux.  Not that I expect we will be doing it with
anything else, but...

> Alternatively, this can be managed by reserving this backup region again
> in crash kernel to avoid any stomping. May be pass backup region
> location to new kernel through parameter segment or through command line
> but don't see a strong reason for doing that.

Probably the biggest reason for doing it in one reservation is that
it happens to be an implementation detail of the crashdump capture
kernel.  If that kernel is not SMP I believe you can safely leave the
first 640k alone.  I know at least one other effort has had success in
that area.

In general it is not good to make unnecessary implementation details
between two pieces of software be part of their interface.

Eric
Discussion subject changed to "[PATCH] Reserving backup region for kexec based crashdumps." by Vivek Goyal
Vivek Goyal  
Newsgroups: linux.kernel
From: Vivek Goyal <vgo...@in.ibm.com>
Date: Sun, 23 Jan 2005 10:30:13 +0100
Local: Sun, Jan 23 2005 4:30 am
Subject: Re: [Fastboot] [PATCH] Reserving backup region for kexec based crashdumps.

On Fri, 2005-01-21 at 16:43, Eric W. Biederman wrote:
> On deeper review your patch as it stands is incomplete.  In particular
> you don't provide a way to either hardcode or dynamically set
> the area you are attempt to reserve to hold the backup region.

Well, here is the new patch. This one steals the 640k from the top of
the memory region reserved for the crash kernel.

A new command line parameter (crashbackup=) has been introduced for
crash dump kernels. This parameter specifies the location of the backup
region from which to retrieve the backup data.

Thanks
Vivek

  crashdump-x86-reserve-640k-memory.patch
Discussion subject changed to "[PATCH] Reserving backup region for kexec based crashdumps." by Andrew Morton
Andrew Morton  
Newsgroups: linux.kernel
From: Andrew Morton <a...@osdl.org>
Date: Thu, 27 Jan 2005 03:20:09 +0100
Local: Wed, Jan 26 2005 9:20 pm
Subject: Re: [Fastboot] [PATCH] Reserving backup region for kexec based crashdumps.
ebied...@xmission.com (Eric W. Biederman) wrote:

> There is evil intermingling and false dependency sharing between
>  the dying kernel and the crash capture kernel in this patch,

yikes!  I'll drop it from -mm while we have a rethink.
Eric W. Biederman  
Newsgroups: linux.kernel
From: ebied...@xmission.com (Eric W. Biederman)
Date: Thu, 27 Jan 2005 04:20:09 +0100
Local: Wed, Jan 26 2005 10:20 pm
Subject: Re: [Fastboot] [PATCH] Reserving backup region for kexec based crashdumps.

Right now I am very frustrated with reviewing any of the crashdump
patches.  When I make comments usually things change just enough that
what I said is addressed but things are addressed very much at
a surface level.  Which means that if I think any kind of substantial
change is needed the only way I seem to be able to communicate
that is by actually implementing it myself.

Code that works today is great; it manages the job of requirements
capture.  But just throwing code together when you are dealing
with fundamental interface boundaries is not a good way to build
a sustainable design.  And with the crashdump code I want an
interface that is at least as simple and as stable as the syscall
interface.

At the very least if a patch is just a snapshot of your development
process up for comment and you are going to continue on making
headway please say as much.  If I know the code is quite possibly
going to change in some pretty fundamental ways I can stop worrying
about it.  This patch is certainly nothing I would want for more
than a couple-of-days hack in my personal development tree.

I will try once again...

There is evil intermingling and false dependency sharing between
the dying kernel and the crash capture kernel in this patch, and
virtually all of the code is unnecessary.  I have already addressed
why.

Vivek Goyal <vgo...@in.ibm.com> writes:
> On Fri, 2005-01-21 at 16:43, Eric W. Biederman wrote:
> > On deeper review your patch as it stands is incomplete.  In particular
> > you don't provide a way to either hardcode or dynamically set
> > the area you are attempt to reserve to hold the backup region.

> Well. Here is the new patch. This one steals the 640k from top of memory
> region reserved for crash kernel.

> A new command line parameter (crashbackup=) has been introduced for
> crash dump kernels. This parameter specifies the location of backup
> region from where to retrieve the backup data.

What is wrong with user space doing all of the extra space
reservation?

Could you send this fairly obvious kexec fix, as a separate patch?

Eric
Discussion subject changed to "[PATCH] Reserving backup region for kexec based crashdumps." by Vivek Goyal
Vivek Goyal  
Newsgroups: linux.kernel
From: Vivek Goyal <vgo...@in.ibm.com>
Date: Thu, 27 Jan 2005 14:00:16 +0100
Local: Thurs, Jan 27 2005 8:00 am
Subject: Re: [Fastboot] [PATCH] Reserving backup region for kexec based crashdumps.
Hi Eric,

It looks like we are looking at things a little differently. I
see a portion of the picture in your mind, but obviously not
entirely.

Perhaps, we need to step back and iron out in specific terms what
the interface between the two kernels should be in the crash dump
case, and the distribution of responsibility between kernel, user space
and the user.

[BTW, the patch was intended as a step in development up for
comment early enough to be able to get agreement on the interface
and think issues through to more completeness before going
too far. Sorry, if that wasn't apparent.]

When you say "evil intermingling", I'm guessing you mean the
"crashbackup=" boot parameter ? If so, then yes, I agree it'd
be nice to find a way around it that doesn't push hardcoding
elsewhere.

Let me explain the interface/approach I was looking at.

1. The first kernel reserves some area of memory for the crash/capture
kernel, as specified by the crashkernel=X@Y boot time parameter.

2. The first kernel marks the top 640K of this area as the backup area
(if the architecture needs it). This is a sort of hardcoding, and this
space reservation could probably be managed from user space as well, as
you suggest in your mail (quoted below).

3. The location of the backup region is exported through /proc/iomem,
which can be read by a user space utility to pass this information to
the purgatory code to determine where to copy the first 640K.

Note that we do not make any additional reservation for the
backup region. We carve this out from the top of the already
reserved region and export it through /proc/iomem so that
the user space code and the capture kernel code need not
make any assumptions about where this region is located.

4. Once the capture kernel boots, it needs to know the location of
backup region for two purposes.

a. It should not overwrite the backup region.

b. There needs to be a way for the capture tool to access the original
   contents of the backed up region

The boot time parameter crashbackup=A@B has been provided to pass this
information to the capture kernel. This parameter is valid only for the
capture kernel and becomes effective only if CONFIG_CRASH_DUMP is enabled.

> What is wrong with user space doing all of the extra space
> reservation?

Just for clarity, are you suggesting that kexec-tools create an
additional segment for the backup region and pass the information to
the kernel?

There is no problem in doing the reservation from user space except
one: how do the user, and in turn the capture kernel, come to know the
location of the backup region, assuming that the user is going to
provide the exactmap for the capture kernel to boot into?

Just a thought: is it a good idea for kexec-tools to create and pass
memmap parameters, making the appropriate adjustment for the backup
region?

I had another question. How is the starting location of the ELF headers
communicated to the capture tool? Is a parameter segment a good idea, or
some hardcoding?

Another approach could be that the backup area information is encoded
in the ELF headers, the capture kernel is booted with a modified memmap
(the user gets the backup region information from /proc/iomem), and the
capture tool extracts the backup area information from the ELF headers
as stored by the first kernel.

Could you please elaborate a little more on which aspects of your view
differ from the above?

Thanks
Vivek
