Past-pm basic string types
Allison Randal  
Newsgroups: perl.perl6.internals
From: alli...@kasuku.org (Allison Randal)
Date: Tue, 12 Dec 2006 09:43:39 -0800
Local: Tues, Dec 12 2006 12:43 pm
Subject: Past-pm basic string types
Patrick, what's the best way to pass through string types from a
compiler to Parrot without doing full string processing? To pass the
current tests, Punie only needs Parrot's single- and double-quoted
strings, but Past-pm is escaping them. So:

   print "\n";

reaches the PIR translation as:

   print "\\n"

(I will add full string processing to Punie later, but since other
compilers will also need basic Parrot string types, it makes sense to
figure it out now.)

Allison


Patrick R. Michaud  
Newsgroups: perl.perl6.internals
From: pmich...@pobox.com (Patrick R. Michaud)
Date: Tue, 12 Dec 2006 15:20:26 -0600
Local: Tues, Dec 12 2006 4:20 pm
Subject: Re: Past-pm basic string types

On Tue, Dec 12, 2006 at 09:43:39AM -0800, Allison Randal wrote:
> Patrick, what's the best way to pass-through string types from a
> compiler to Parrot without doing full string processing? To pass the
> current tests, Punie only needs Parrot's single- and double-quoted
> strings, but Past-pm is escaping them.

PAST-pm expects it to be pretty rare that an HLL's string literal
format will exactly match what works as a string literal in PIR, so
PAST::Val nodes expect the HLL to have already decoded the string
constant according to whatever rules the HLL uses.  Then PAST-pm
can re-encode the string into a form that is guaranteed to work
in Parrot (even handling things such as placing "unicode:" in
front of PIR string literals if the string has characters that
fall outside of the ASCII range).
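
To make the decode-then-re-encode contract concrete, here is a minimal
sketch in Python (not PAST-pm's actual code). The function name
pir_string_literal and the exact escape spellings are assumptions chosen
for illustration; only the "unicode:" prefix for non-ASCII strings comes
from the description above. It also shows why handing PAST::Val the raw
source text, rather than the decoded value, yields the doubled backslash
from the original report.

```python
def pir_string_literal(value: str) -> str:
    """Re-encode an already-decoded string as a double-quoted, PIR-style
    literal. Hypothetical helper: the escape set is a small illustrative
    subset, not PAST-pm's real encoder."""
    escapes = {"\\": "\\\\", '"': '\\"', "\n": "\\n", "\t": "\\t"}
    out = []
    for ch in value:
        if ch in escapes:
            out.append(escapes[ch])
        elif ord(ch) < 32 or ord(ch) == 127:
            out.append("\\x%02x" % ord(ch))   # other control characters
        elif ord(ch) > 127:
            out.append("\\x{%x}" % ord(ch))   # non-ASCII code point
        else:
            out.append(ch)
    literal = '"' + "".join(out) + '"'
    # Mark the literal as Unicode if anything falls outside ASCII.
    if any(ord(ch) > 127 for ch in value):
        literal = "unicode:" + literal
    return literal


decoded = "\n"       # one character: what the HLL means by "\n"
raw_source = "\\n"   # two characters: backslash and n, as typed in source

print(pir_string_literal(decoded))     # prints "\n"
print(pir_string_literal(raw_source))  # prints "\\n"
```

The second call reproduces the symptom: an encoder that is given
undecoded source text escapes the backslash itself.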

I can modify PAST-pm to provide a "send exactly this string to PIR"
option for PAST::Val.  It would seem more generally useful, though, to
provide a generic function or opcode that can decode single- and
double-quoted strings according to PIR's encoding rules, and then use
that to get the string into PAST::Val.

PGE::Text could provide such a feature as part of its library, i.e.,
subrules like:

    " <PGE::Text::pir_quoted_string: "> "
    ' <PGE::Text::pir_quoted_string: '>  '

could parse a valid PIR string literal and provide the
decoded value as the result object.
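
As a rough picture of what such a decoding rule would have to do, here
is a small Python sketch. It is not the proposed
PGE::Text::pir_quoted_string subrule, which would be written in PIR; the
function name decode_pir_quoted is hypothetical, and the handful of
escapes handled is an assumed subset of what PIR accepts. Single-quoted
literals are assumed to be taken verbatim.

```python
import string


def decode_pir_quoted(literal: str) -> str:
    """Decode a quoted literal into its string value (illustrative subset)."""
    if len(literal) < 2 or literal[0] != literal[-1] or literal[0] not in "'\"":
        raise ValueError("not a quoted literal: %r" % literal)
    body = literal[1:-1]
    if literal[0] == "'":
        return body  # assumption: no escape processing in single quotes

    simple = {"n": "\n", "t": "\t", "r": "\r", "\\": "\\", '"': '"'}
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        if i + 1 >= len(body):
            raise ValueError("dangling backslash")
        nxt = body[i + 1]
        if nxt in simple:
            out.append(simple[nxt])
            i += 2
        elif nxt == "x":
            # Two-digit hex escape, e.g. \x41
            digits = body[i + 2:i + 4]
            if len(digits) != 2 or any(d not in string.hexdigits for d in digits):
                raise ValueError("bad \\x escape")
            out.append(chr(int(digits, 16)))
            i += 4
        else:
            raise ValueError("unsupported escape: \\" + nxt)
    return "".join(out)


print(repr(decode_pir_quoted('"\\n"')))  # a one-character newline string
print(repr(decode_pir_quoted("'\\n'")))  # backslash and n, kept verbatim
```

The decoded value is what a compiler would then store in the PAST::Val
node, leaving the re-encoding to PAST-pm.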

> (I will add full string processing to Punie later, but since other
> compilers will also need basic Parrot string types, it makes sense to
> figure it out now.)

I think that the various languages have enough differences in
string literal handling that each compiler will end up writing
its own string literal decoder.  (Or we need a semi-powerful library
to handle the many differences.)  In the meantime having an
easy-to-access subrule for "just pretend it's a quoted literal
according to PIR conventions" might be a good way for someone
wanting to bootstrap a compiler, without placing Parrot-specific
encodings into PAST-pm.

Lastly, I'm still working out the handling of HLL to Parrot
type mappings -- it's also possible that some of this will
fall out as a result of that.

Pm


Allison Randal  
Newsgroups: perl.perl6.internals
From: alli...@kasuku.org (Allison Randal)
Date: Tue, 12 Dec 2006 13:57:20 -0800
Local: Tues, Dec 12 2006 4:57 pm
Subject: Re: Past-pm basic string types

Patrick R. Michaud wrote:

> PAST-pm expects it to be pretty rare that an HLL's string literal
> format will exactly match what works as a string literal in PIR, so
> PAST::Val nodes expect the HLL to have already decoded the string
> constant according to whatever rules the HLL uses.  Then PAST-pm
> can re-encode the string into a form that is guaranteed to work
> in Parrot (even handling things such as placing "unicode:" in
> front of PIR string literals if the string has characters that
> fall outside of the ASCII range.)

Agreed that this is a good general solution.

> I can modify PAST-pm to provide a "send exactly this string to PIR"
> option for PAST::Val.  

Yes, good idea for the simple case.

> More generally useful would seem to be to
> provide a generic function or opcode that can decode single/double
> quoted strings according to PIR's encoding rules, and then use
> that to get the string into PAST::Val.

That's a lot of extra work when all you need is for the string to pass
through to PIR exactly as it was parsed. So I'd skip this one.

> I think that the various languages have enough differences in
> string literal handling that each compiler will end up writing
> its own string literal decoder.  (Or we need a semi-powerful library
> to handle the many differences.)  In the meantime having an
> easy-to-access subrule for "just pretend it's a quoted literal
> according to PIR conventions" might be a good way for someone
> wanting to bootstrap a compiler, without placing Parrot-specific
> encodings into PAST-pm.

Ultimately we will want some general tools to assist compiler writers
with string handling. It really shouldn't be any more difficult than
writing the general grammar rules or operator precedence rules. Since
special string handling is something Perl 6 users are likely to need too
(pretty much all templating is customized string interpolation), it's
worth punting this to p6l to see if they come up with a nice interface.

In the meantime, a string decoder rule written in PIR is good enough.

> Lastly, I'm still working out the handling of HLL to Parrot
> type mappings -- it's also possible that some of this will
> fall out as a result of that.

Makes sense.

Allison


Patrick R. Michaud  
Newsgroups: perl.perl6.internals
From: pmich...@pobox.com (Patrick R. Michaud)
Date: Wed, 13 Dec 2006 15:52:48 -0600
Local: Wed, Dec 13 2006 4:52 pm
Subject: Re: Past-pm basic string types

On Tue, Dec 12, 2006 at 01:57:20PM -0800, Allison Randal wrote:
> Patrick R. Michaud wrote:
> >I can modify PAST-pm to provide a "send exactly this string to PIR"
> >option for PAST::Val.  

> Yes, good idea for the simple case.

After sleeping on it overnight, I realized that PAST-pm already
has this feature.

Currently PAST-pm checks the PAST::Val node's "ctype"
attribute to decide whether to encode the literal value
as a Parrot form.  If the node doesn't have a ctype that indicates
"string constant", then PAST-pm just uses the literal value
directly in the output.

So, just don't set "ctype", and whatever the node has as
its "name" attribute will go directly into the PIR output.

Here's an example:

    $ cat x.pir
    .sub main :main
        load_bytecode 'PAST-pm.pbc'
        .local pmc valnode, blocknode, pir

        ##  $S0 is the string we want to appear in the output
        $S0 = '"\n"'
        valnode = new 'PAST::Val'
        valnode.'init'('vtype'=>'.String', 'name'=>$S0)
        blocknode = valnode.'new'('PAST::Block', valnode, 'name'=>'anon')

        ##  compile the tree to PIR and print the result
        $P99 = compreg 'PAST'
        pir = $P99.'compile'(blocknode, 'target'=>'pir')
        print pir
    .end

    $ ./parrot x.pir

    .sub "anon"
        new $P10, .String
        assign $P10, "\n"
        .return ($P10)
    .end
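
The ctype check described above can be modeled in a few lines of Python.
This is only a sketch of the decision, not PAST-pm's implementation: the
ValNode class, the emit_val function, and the ctype value "string" are
all hypothetical stand-ins, and the encoder is deliberately minimal.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ValNode:
    """Hypothetical stand-in for a PAST::Val node."""
    name: str                    # the literal value held by the node
    vtype: str = ".String"       # Parrot type to instantiate
    ctype: Optional[str] = None  # "string" marks a string constant (assumed)


def encode_for_pir(value: str) -> str:
    """Minimal encoder: escape backslash, quote and newline, then quote."""
    escaped = (value.replace("\\", "\\\\")
                    .replace('"', '\\"')
                    .replace("\n", "\\n"))
    return '"' + escaped + '"'


def emit_val(node: ValNode, reg: str = "$P10") -> List[str]:
    """Return PIR-like lines for the node."""
    if node.ctype == "string":
        value = encode_for_pir(node.name)  # decoded value gets re-encoded
    else:
        value = node.name                  # no ctype: emitted verbatim
    return ["new %s, %s" % (reg, node.vtype),
            "assign %s, %s" % (reg, value)]


# No ctype set: the name, quotes included, passes straight through.
for line in emit_val(ValNode(name='"\\n"')):
    print(line)
```

With ctype left unset, the node's name appears in the output unchanged,
which is the pass-through behavior the x.pir example relies on.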

Eventually the handling of "ctype" is going to change -- first,
the name will change to be more descriptive (but I'll leave a 'ctype'
accessor in place to give compilers time to switch); second, any
ctype specifications will be held in a HLL class mapping table
instead of in each PAST::Val node.

There is a good chance that PAST-pm will eventually treat PAST::Val
nodes of type .String as needing their values to be encoded for Parrot,
but to protect against this Punie (and other compilers) can
use .Undef:

        $S0 = '"\n"'
        valnode = new 'PAST::Val'
        valnode.'init'('vtype'=>'.Undef', 'name'=>$S0)

Since the node isn't a string type, PAST-pm will use the "name"
$S0 value directly in the output PIR without performing any
encoding on the literal value, and the generated PIR from the node
would look like

        new $P10, .Undef
        assign $P10, "\n"

And this does exactly what you want.  :-)

Pm

